On Negative Alternative Questions*
Chung-hye Han 



1 Introduction 

The question in (1) is formally a yes-no question. But in terms of its inter- 
pretation, it is ambiguous: it can have either a yes-no question reading or an 
alternative question reading. 

(1) Did John drink coffee or tea?

Under the yes-no question reading, the speaker has no presupposition as to 
whether John drank coffee or tea, and the possible answers are Yes, John drank 
coffee or tea and No, John didn’t drink coffee or tea. Under the alternative
question reading, the speaker presupposes that John drank either coffee or tea, 
and the possible answers are John drank coffee and John drank tea. 

The corresponding negative yes-no question can be formed in two ways: 
with n’t as in (2a), and with not as in (2b). I will refer to the negative yes-no
questions formed with n’t as n’t-questions and the ones formed with not as
not-questions.

(2) a. Didn’t John drink coffee or tea? 

b. Did John not drink coffee or tea? 


Although the questions in (2a) and (2b) have the same components, namely 
the proposition John drank coffee or tea and negation, they do not have the ex- 
act same interpretation. The question in (2b) has both the yes-no question 
reading and the alternative question reading available. Under the yes-no ques- 
tion reading, the possible answers are Yes, John drank coffee or tea and No, 
John did not drink coffee or tea. Under the alternative question reading, the 
speaker presupposes that among coffee and tea, there is a drink that John didn’t 
drink, and the possible answers are John did not drink coffee and John did not 
drink tea. On the other hand, the question in (2a) only has the yes-no question 
reading available. 

In this paper, I show that the (un)availability of the alternative question 
reading in negative yes-no questions such as (2) is a puzzle given the syntax 

*I am indebted to Maribel Romero for extensive discussions on this topic. I also 
thank the participants in the semantics of questions seminar in Spring 1999 for discus- 
sions and comments: Cassandre Creswell, Alexis Dimitriadis, Narae Han, and Alexan- 
der Williams. I also acknowledge the anonymous reviewer for very helpful comments. 

U. Penn Working Papers in Linguistics, Volume 6.3, 2000
of yes-no questions and the syntax of disjunction proposed in Larson (1985).
In section 2, I briefly discuss Larson’s analysis of affirmative alternative
questions and extend it to negative alternative questions. It will turn out that
although Larson makes correct predictions for n’t-questions, he does not do so
for not-questions. In sections 3 and 4, I consider two alternative syntactic
approaches that may explain the problem at hand. In section 3, I modify
Larson’s (1985) analysis to include LF movement of the disjunctive phrase, and in
section 4, I extend Schwarz’s (1999) gapping analysis of either...or
constructions to whether...or constructions. However, I will point out problems for
both approaches; neither can explain the interpretive asymmetry between
n’t-questions and not-questions. In section 5, I pursue a non-syntactic approach
and suggest that (un)availability of the alternative question reading in negative 
yes-no questions should be explained by the interaction between the syntax 
and the interpretive component of the grammar. 

2 Larson (1985) 

2.1 On Affirmative Questions 

According to Larson (1985), a yes-no question has an empty operator that 
corresponds to whether. It originates from a disjunction phrase and moves 
to [Spec, CP], marking the scope of disjunction. Moreover, a yes-no question 
may have an unpronounced disjunction phrase or not. If the disjunction phrase 
from which the empty whether originates is the unpronounced or not, then the 
yes-no question reading is derived. Otherwise, the alternative question reading 
is derived. For instance, the yes-no question in (1) (repeated below as (3)) can 
have either a yes-no question reading or an alternative question reading. Under 
the yes-no question reading, the empty whether operator originates from the 
unpronounced or not and moves to [Spec, CP], as represented in (3a). This 
representation makes available the alternatives John drank coffee or tea and 
John didn’t drink coffee or tea as answers. Under the alternative question 
reading, the empty operator originates from the disjunction phrase coffee or 
tea and moves to [Spec, CP], as represented in (3b). This representation makes 
available the alternatives John drank coffee and John drank tea as answers. 
(3) Did John drink coffee or tea?

a. yes-no question: 

Op_i (e_i or not) [did John drink [coffee or tea]]

{John drank coffee or tea, John didn’t drink coffee or tea} 

b. alternative question: 

Op_i [did John drink [e_i coffee or tea]]

{John drank coffee, John drank tea} 
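
The two derivations in (3) can be illustrated with a minimal sketch in Python. This is not part of Larson's proposal; it only assumes that the origin of the empty operator selects the set of answers. The function name answer_set and the labels "or_not" and "disjunction" are hypothetical names introduced for illustration.

```python
def answer_set(subject, verb_past, verb_base, disjuncts, operator_origin):
    """Return the answers made available by one origin of the empty operator.

    operator_origin is a hypothetical label:
      "or_not"      -> operator moves from the unpronounced 'or not', as in (3a)
      "disjunction" -> operator moves from the overt disjunction, as in (3b)
    """
    whole = " or ".join(disjuncts)
    if operator_origin == "or_not":
        # Yes-no reading: the whole proposition and its negation.
        return [f"{subject} {verb_past} {whole}",
                f"{subject} didn't {verb_base} {whole}"]
    if operator_origin == "disjunction":
        # Alternative reading: one answer per disjunct.
        return [f"{subject} {verb_past} {d}" for d in disjuncts]
    raise ValueError(f"unknown operator origin: {operator_origin!r}")


yes_no = answer_set("John", "drank", "drink", ["coffee", "tea"], "or_not")
alternative = answer_set("John", "drank", "drink", ["coffee", "tea"], "disjunction")
```

Under these assumptions, yes_no corresponds to the answer set in (3a) and alternative to the answer set in (3b).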

Supporting evidence for the proposal that empty whether moves from a 
disjunction phrase to [Spec, CP] comes from the fact that yes-no questions that 
have a disjunction phrase inside an island do not have the alternative question 
reading available. 

(4) Do you believe the claim that Bill resigned or retired? 

a. yes-no question: 

Op_i (e_i or not) [do you believe [NP the claim that Bill resigned
or retired]]

b. * alternative question: 

Op_i [do you believe [NP the claim that Bill [e_i resigned or
retired]]]

In (4), the disjunctive phrase resigned or retired is inside a complex NP. The
alternative question reading is not available since the empty operator would 
have to move out of an island to generate this reading. But the yes-no question 
reading is available, since under this reading the empty operator is moving 
from the unpronounced or not, which is not inside an island. 

2.2 On Disjunction in Negative Declaratives 

Before extending Larson’s analysis to negative questions, we need to under- 
stand his treatment of disjunction scope in negative declaratives. Larson claims 
that (5) only has the reading where negation has scope over the disjunction. 
This is the reading represented in (5a), according to which John drank neither 
coffee nor tea. The reading represented in (5b), according to which John drank 
either coffee or tea, is claimed to not exist. 

(5) John did not drink coffee or tea. 

a. John did not drink Op_i [e_i coffee or tea]. He drank juice. (narrow
scope or)

b. * Op_i John did not drink [e_i coffee or tea]. But I don’t know
which. (wide scope or)
According to Larson, the scope of disjunction is determined by the movement
of a scope indicating operator from the disjunction phrase to higher up in
the clause. In yes-no questions, the scope indicating operator is overt or empty 
whether, and in declaratives it is either or a corresponding empty either opera- 
tor. Adopting the semantics of disjunction in Rooth and Partee (1982), Larson 
argues that a disjunctive phrase introduces a free variable that must be bound 
by the scope indicating operator that originates from the disjunctive phrase. 
This is how the scope of disjunction is marked. Larson further assumes that 
negation always introduces existential closure, which unselectively binds any 
free variable under its scope. In (5b), the empty operator cannot bind the free 
variable introduced by the disjunctive phrase because it is already bound by 
the existential closure of the intervening negation. But in (5a), the empty op- 
erator binds the free variable of the disjunctive phrase since the negation does 
not intervene between the operator and the disjunctive phrase. 
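
The blocking effect described above can be illustrated with a small sketch. It assumes, purely for illustration, that a clause is a flat list of labels ordered from the highest position to the lowest, and that the operator can bind the variable of the disjunctive phrase only if no negation sits between them. The labels "OP", "NEG" and "DISJ" and the function name are hypothetical and are not Larson's notation.

```python
def operator_binds_disjunction(clause):
    """Check whether the scope operator can bind the disjunction's variable.

    clause is a list of labels ordered from highest to lowest position.
    Negation is assumed to introduce existential closure, which binds any
    free variable below it before a higher operator can reach it.
    """
    op = clause.index("OP")
    disj = clause.index("DISJ")
    if op > disj:
        # The operator must be above the disjunctive phrase to bind into it.
        return False
    # Binding succeeds only if no negation intervenes.
    return "NEG" not in clause[op + 1:disj]


narrow_scope_or = ["NEG", "OP", "DISJ"]  # configuration in (5a)
wide_scope_or = ["OP", "NEG", "DISJ"]    # configuration in (5b)

can_bind_5a = operator_binds_disjunction(narrow_scope_or)
can_bind_5b = operator_binds_disjunction(wide_scope_or)
```

On this sketch, can_bind_5a holds and can_bind_5b does not, which mirrors the judgments Larson reports for (5).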

2.3 Extending Larson (1985) to Negative Questions 

Let us now apply Larson’s analysis to negative yes-no questions. We will
see that he correctly predicts that n’t-questions only have the yes-no question
reading, but he wrongly predicts that the alternative question reading is not
available for not-questions. I repeat the questions in (2) as (6) and (7) below
for convenience.

In (6), the empty whether operator can move from the unpronounced or 
not phrase to [Spec, CP], deriving the yes-no question reading. This is rep- 
resented in (6a). But the empty operator cannot move from the disjunctive 
phrase coffee or tea to [Spec, CP], as in (6b). This is because the interven- 
ing negation introduces existential closure which binds the free variable of the 
disjunctive phrase, thereby blocking the empty operator from marking the dis- 
junctive scope. And thus, the alternative question reading is correctly ruled 
out. 

(6) Didn’t John drink coffee or tea? 

a. yes-no question: 

Op_i (e_i or not) [didn’t John drink [coffee or tea]]

b. * alternative question: 

Op_i [didn’t John drink [e_i coffee or tea]]

In (7), the yes-no question reading is derived by moving the empty opera- 
tor from the unpronounced or not to [Spec, CP], as represented in (7a). How- 
ever, under Larson’s analysis, the alternative question reading is incorrectly 
predicted to be ruled out. This is because the intervening negation between 
the empty operator in [Spec, CP] and the disjunctive phrase would block the
empty operator from marking the disjunctive scope, as represented in (7b).

(7) Did John not drink coffee or tea? 

a. yes-no question: 

Op_i (e_i or not) [did John not drink [coffee or tea]]

b. alternative question: 

Op_i [did John not drink [e_i coffee or tea]]

3 Syntactic Approach 1: Modifying Larson (1985) 

Contrary to Larson (1985), I point out that in negative declaratives with a 
disjunctive phrase the disjunction can have scope over negation, given the right 
context. For instance, assume that my mother always bakes too many different 
kinds of pies for Thanksgiving dinner, and so every year, there are too many 
left-over pies. But this year, she decided not to make one of the pies she 
doesn’t like, namely pumpkin pies and apple pies. In this context, I can say: 

(8) For Thanksgiving dinner this year, my mother is not going to make 
a pumpkin pie or an apple pie. But I don’t know which. 

According to the native speakers that I have consulted, the first sentence in 
(8) can have the reading paraphrasable as My mother is not going to make a 
pumpkin pie or she is not going to make an apple pie. This is the wide scope 
reading of disjunction over negation. 

Further, we have already seen that in matrix negative yes-no questions
with a disjunctive phrase, not-questions allow the disjunction to have scope
over negation, deriving the alternative question reading, although this was not
possible for n’t-questions. It turns out that in indirect negative yes-no questions
with a disjunctive phrase, both n’t- and not-questions allow the disjunction to
have scope over negation. Assume a context in which it is well known that
John does not eat a particular type of meat for some reason, but I don’t know
which type he doesn’t eat. So, I ask John to find out the correct information.
In this context, both indirect questions in (9) can have the alternative question
reading, as can be seen by the fact that both sentences in (9) can be continued
with the phrase because I don’t know which.

(9) a. I asked John whether he doesn’t eat beef or chicken (because I
don’t know which).

b. I asked John whether he does not eat beef or chicken (because
I don’t know which).
One way of deriving the interpretive representation in which disjunction
scopes over negation is by allowing the disjunctive phrase to undergo LF 
movement. For instance, in (8), we can assume that a pumpkin pie or an apple 
pie is a generalized quantifier that can undergo QR (quantifier raising) to IP at 
LF. If it undergoes QR, then it escapes negation, and the free variable of the 
disjunction phrase will not be existentially closed, leaving it free to be bound 
by the empty operator that is higher in the clause, as represented in (10a). This 
derives the reading in which disjunction scopes over negation. On the other 
hand, if the disjunction phrase does not undergo QR, then the free variable 
is bound by the empty operator that is lower in the clause, as represented in 
(10b), deriving the reading in which negation scopes over the disjunction.

(10) a. [IP Op_i [e_i a pumpkin pie or an apple pie]_j [IP My mother will
not make t_j]]

b. [IP My mother will not make Op_i [e_i a pumpkin pie or an apple
pie]]

Now we can apply this analysis to negative questions. The explanation for 
the availability of the yes-no question reading in (6) and (7) is trivial. These 
questions have an unpronounced or not that contributes a free variable, and 
it gets bound by the empty whether operator. As for the (un)availability of 
the alternative question reading, in (6), coffee or tea can undergo QR to IP,
but it cannot QR higher than negation n’t since didn’t is in C⁰. The variable
introduced by disjunction would be bound by the existential closure introduced 
by negation and so the alternative question reading is ruled out. In (7), if 
coffee or tea undergoes QR to IP, then it is not under the scope of negation not 
anymore. And so, the free variable of disjunction can be bound by the empty 
whether operator, deriving the alternative question reading. 
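
The reasoning in this paragraph can be sketched as a comparison of structural heights. The sketch below is only an illustration under simplifying assumptions: positions are ranked by an arbitrary integer, n’t sits in C (because didn’t is in C⁰), not sits in a lower negation position, and QR lands at IP. The dictionary names and the function name are hypothetical.

```python
# Relative heights of the positions mentioned in the text (higher = larger).
HEIGHT = {"C": 3, "IP": 2, "Neg": 1, "VP": 0}

# Assumed position of each negation form in a matrix question.
NEG_POSITION = {"n't": "C", "not": "Neg"}


def alternative_reading_available(neg_form, undergoes_qr=True):
    """Predict whether the disjunction escapes the scope of negation.

    The disjunctive phrase starts inside VP and may QR to IP. It escapes
    existential closure only if it ends up higher than negation.
    """
    disjunction_position = "IP" if undergoes_qr else "VP"
    return HEIGHT[disjunction_position] > HEIGHT[NEG_POSITION[neg_form]]


prediction_6 = alternative_reading_available("n't")  # Didn't John drink coffee or tea?
prediction_7 = alternative_reading_available("not")  # Did John not drink coffee or tea?
```

With these assumed heights, the alternative question reading is predicted to be unavailable in (6) and available in (7).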

What if the disjunction phrase is not a generalized quantifier, as in (11)? 
In (11), the items in disjunction are verbs.

(11) Did John not dance or sing at the wedding? 

We can say that the disjunction V dance or sing moves to I⁰ at LF. Assuming
that negation projects below INFL, the disjunction is above negation
at LF. Thus, the empty whether will bind the free variable of the disjunctive 
phrase and so the alternative question reading is derived. 

So far, we have seen examples in which QR and LF verb movement can 
be argued to be involved. Given that these two operations are independently 
motivated for English, the analysis that assumes LF movement of the disjunction
phrase seems attractive (cf. Chomsky 1995, May 1985). But what if the
disjunctive phrase is adjectival, as in (12)? Assume a context in which it is
well-known that John didn’t date girls with a particular hair color last year. 

(12) Did John not date any blond or red haired girls last year? 

The NP any blond or red haired girls has to stay lower than negation because
the negative polarity item (NPI) any has to be licensed by negation.¹ But
we have to get the disjunction out of the scope of negation to get the alterna- 
tive question reading. But then, we would be forced to move just the adjective 
phrase blond or red. However, it is difficult to independently motivate LF 
adjective movement in English. Consequently, the analysis that assumes the 
movement of the disjunction phrase cannot be successful. 

4 Syntactic Approach 2: Gapping 

Schwarz (1999) argues that the syntax of either...or can be assimilated to the
syntax of coordinate constructions that involve gapping. Gapping originally
refers to the grammatical process which is responsible for the deletion of a verb
in the second coordinate of a conjunctive coordination under identity with the
first coordinate, as in (13) (Ross 1970). The deleted material in the second
coordinate is called the gap, and the materials in the second coordinate that have
not been deleted are called remnants. I represent the gaps with parentheses.

(13) a. Tom has a pistol and Dick a sword. 

Tom has a pistol and Dick (has) a sword. (Schwarz 1999, 30a) 
b. Some ate beans and others rice. 

Some ate beans and others (ate) rice. (Schwarz 1999, 30b) 

Schwarz points out that gaps may contain more than just a verb, although
the finite verb of the second coordinate is always included in the gap, and
argues that this fact is compatible with the idea that either...or constructions
involve gapping.

¹Although yes-no questions in general license NPIs such as any and ever, alternative
questions do not, as pointed out by Ladusaw (1980) and Higginbotham (1993). For
instance, the question in (1a) is ambiguous between a yes-no question and an alternative
question, whereas the question in (1b) can only be interpreted as a yes-no question.

(1) a. Did John play chess or checkers? 
b. Did anybody play chess or checkers? 

NPIs in alternative questions are allowed only when there is an explicit licensor such 
as negation, as shown in (12). See Higginbotham (1993) and Han and Siegel (1997) 
for an account of NPI licensing in yes-no questions and alternative questions. 
(14) a. Bill must eat the peaches quickly and Harry slowly.

Bill must eat the peaches quickly and Harry (must eat the peaches) 
slowly. (Schwarz 1999, 33a) 

b. * Bill must eat the peaches quickly and Harry might slowly. 

Bill must eat the peaches quickly and Harry might (eat the 
peaches) slowly. (Schwarz 1999, 30b) 

According to Schwarz, in either.. .or constructions, either marks the left 
periphery of the first disjunct, and some materials in the second disjunct are 
deleted under identity with the first disjunct. 

(15) a. John either ate rice or beans. 

John either [VP ate rice] or [VP (ate) beans] (Schwarz 1999,
28a)

b. Either John ate rice or beans.

either [IP John ate rice] or [IP (John ate) beans] (Schwarz
1999, 28b)

One piece of supporting evidence for the gapping analysis of either...or
constructions comes from what Schwarz calls dangling remnants. Dangling
remnants would occur in the second conjunct of a coordinate construction if
there were elision in both the first and the second conjunct. Schwarz points
out that dangling remnants are prohibited in coordinate constructions, and
shows that they are prohibited in either...or constructions as well.

(16) a. * Some talked about politics and others with me about music. 

some talked (with me) about politics and others (talked) with 
me about music (Schwarz 1999, 40b) 

b. * John dropped the coffee and Mary clumsily the tea. 

John (clumsily) dropped the coffee and Mary clumsily (dropped) 
the tea (Schwarz 1999, 41b) 

(17) a. ?? Either this pissed Bill or Sue off. 

either this pissed Bill (off) or (this pissed) Sue off (Schwarz 
1999, 43a) 

b. ?? Either they locked you or me up. 

either they locked you (up) or (they locked) me up (Schwarz
1999, 43c)

Let us then apply Schwarz’s gapping analysis of either...or constructions
to whether...or constructions. Whether would mark the left periphery of the
first disjunct and some materials from the second disjunct would be deleted
under identity with the first disjunct. We will see that this analysis makes
correct predictions for not-questions, but not for n’t-questions. As in Larson
(1985), I am assuming that direct yes-no questions have the empty whether 
operator in [Spec, CP], and that these questions can have unpronounced or 
not. In (18), the empty whether has the option of being associated with or in 
coffee or tea or with or in the unpronounced or not. If it is associated with or in 
coffee or tea, the alternative question reading is derived, and if it is associated 
with or in or not, then the yes-no question reading is derived. 

(18) Did John not drink coffee or tea? 

a. (whether) [did John not drink coffee or tea] [(or not) (did John not 
drink coffee or tea)] 

≈ (whether) [did John not drink coffee or tea] [(or did John drink
coffee or tea)] 

b. (whether) [did John not drink coffee] [or (did John not drink) tea] 

In (19), the empty whether also has the option of associating with the 
or in coffee or tea and the or in the unpronounced or not. But then, both 
the alternative question reading and the yes-no question reading are wrongly 
predicted to be available for n’t-questions. But we have already seen that only
the yes-no question reading is available for n’t-questions.

(19) Didn’t John drink coffee or tea? 

a. (whether) [didn’t John drink coffee or tea] [(or not) (didn’t John
drink coffee or tea)]

≈ (whether) [didn’t John drink coffee or tea] [(or did John drink
coffee or tea)] 

b. (whether) [didn’t John drink coffee] [or (didn’t John drink) tea] 

In fact, Schwarz points out that gapping analysis is not appropriate for 
whether...or constructions since they allow dangling remnants, unlike either...or 
constructions and other coordinate constructions with gapping. 

(20) a. Did this piss Bill or Sue off? 

b. Did she turn the test or the homework in? 

c. Did he gulp one or two down? 

The questions in (20) can all have the alternative question reading. How- 
ever, if we were to apply the gapping analysis to these questions, then we 
would end up with dangling remnants, which were prohibited from other gap- 
ping constructions. 

Furthermore, whether...or constructions behave differently from other gapping
constructions in that while remnants in gapping constructions cannot be
in embedded finite clauses, they can be in whether...or constructions.
(21) a. * The first letter says that you should pay tax and the second letter
V.A.T.

[the first letter says that you should pay tax] and [the second
letter (says [that you should pay) V.A.T]] (Schwarz 1999, 61a)

b. ?? Either Bill said that Mary was drinking or playing video games.
Either [Bill said that Mary was drinking] [or (Bill said [that
Mary was) playing video games]]

(22) a. Did John say that Bill retired or resigned? 

b. Did John claim that Bill drank coffee or tea? 

The questions in (22) all have the alternative question reading available. 
If this reading was derived via gapping in the second disjuncts in (22), then 
the remnants would be in embedded finite clauses. But this was impossible in 
other gapping constructions. 

5 A Non-Syntactic Approach 



We have so far considered and rejected two alternative syntactic approaches to
account for the interpretive asymmetry between n’t-questions and not-questions
exemplified in (2). One approach was an extension of Larson (1985) to include
LF movement of the disjunction phrase, and the other was an extension of
Schwarz’s (1999) gapping analysis of either...or constructions to whether...or
constructions.

Here, I suggest that we go back to Larson’s (1985) analysis, but this time 
abandon his assumption that negation always introduces unselective existen- 
tial closure. In other words, as in Larson, let us assume that disjunction scope 
in yes-no questions is determined by the movement of the empty whether- 
operator from the disjunction phrase, but unlike Larson, let us allow this op- 
erator to move over negation. This is well-motivated given the fact that dis- 
junction can take scope over negation even in negative declaratives in certain 
contexts, as was shown in (8).²

Allowing the empty whether-operator to move over negation allows disjunction
to scope over negation in a not-question like (7). This correctly permits
the alternative question reading that Larson’s original account ruled out.
But now disjunction can scope over negation in n’t-questions as well, which
²Although negative declaratives with a disjunction phrase do allow a reading where
the disjunction takes scope over negation, the fact is that the most easily accessible
reading is the one where negation scopes over the disjunction. I leave open the question
as to why this should be so.
we know lack the alternative question reading. An explanation of this lack,
therefore, cannot come from the syntax alone. 

I propose that the syntax indeed allows both the alternative question
and the yes-no question readings for n’t-questions as well as not-questions.
But the syntax interacts with the interpretive component of the grammar to
rule out the alternative question reading for n’t-questions. That is, the
alternative question reading gets ruled out for n’t-questions because the
interpretation contributed by n’t-questions and the interpretation contributed by
alternative questions are incompatible with each other.

Direct negative yes-no questions formed with n’t are associated with a
special conventional implicature which cannot be cancelled. 

(23) a. Isn’t John intelligent? 

b. Is John not intelligent? 

c. Is John intelligent? 

Yes-no questions formed with n’t imply that the speaker has a bias towards
the answer: s/he expects the answer to be in the affirmative. The question
in (23a) is used when the speaker expects the hearer to simply agree that
John is intelligent by answering yes, or when s/he believes that John is intel- 
ligent but s/he is surprised that the hearer does not seem to share this belief. 
However, yes-no questions formed with not do not necessarily have this impli- 
cature. (23b) can be a polite way of asking whether John is stupid. Moreover, 
the affirmative yes-no question in (23c) does not imply that the speaker has a 
bias towards an answer either. It is a neutral way of asking whether John is 
intelligent or not. 

As for the alternative questions, they do not imply that the speaker has a 
bias towards the answer. They presuppose that the answer to the question is 
either of the alternatives posed by the question, but they do not imply that one 
answer is more likely to be true than the other. 

(24) Did John drink coffee or tea? 

For instance, (24) under the alternative question reading does not imply 
that the speaker expects that it is more likely that John drank coffee or that 
John drank tea. 

Now to explain the problem at hand, the conventional implicature associated
with n’t-questions is not compatible with alternative questions. The
implicature associated with an n’t-question is that one particular answer is
presupposed to be true. But alternative questions by definition cannot have 
any conventional signal as to which of the possible answers is presupposed to 
be true. This means that given an alternative question interpretation, it would 
be impossible to calculate the implicature associated with the n’t-question.
I postulate that this conflict cancels the alternative question reading for
n’t-questions rather than canceling the implicature associated with it. In contrast,
not-questions and affirmative yes-no questions are not associated with the
implicature that the speaker has a bias towards an answer. And so, they can be
interpreted as alternative questions.

Recall from section 3 that indirect yes-no questions allow both the yes-no
question reading and the alternative question reading for n’t-questions as well
as not-questions (and also for affirmative indirect yes-no questions), as shown
in (25) ((25a) and (25b) are repeated from (9)).

(25) a. I asked John whether he doesn’t eat beef or chicken. 

b. I asked John whether he does not eat beef or chicken. 

c. I asked John whether he eats beef or chicken. 

This is predicted by the non-syntactic approach proposed here. Indirect
n’t-questions are not associated with the implicature that the questioner
expects the answer to be in the affirmative, just as in indirect not-questions and
indirect affirmative yes-no questions.

(26) a. I asked Mary whether John isn’t intelligent. 

b. I asked Mary whether John is not intelligent. 

c. I asked Mary whether John is intelligent. 

Under the non-syntactic approach, the alternative question reading is ex- 
pected to be available for the indirect yes-no questions in (25) because they 
are not associated with a conventional implicature that is incompatible with 
the alternative question reading. 
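
The proposal of this section can be summarized in a short sketch: the syntax generates both readings, and an interpretive filter removes the alternative question reading whenever the question carries the affirmative-bias implicature. The sketch assumes, following the discussion above, that only direct n’t-questions carry this implicature. The names SYNTACTIC_READINGS, has_affirmative_bias and available_readings are hypothetical and serve only as an illustration.

```python
# Readings that the syntax is assumed to make available for every yes-no question.
SYNTACTIC_READINGS = frozenset({"yes-no", "alternative"})


def has_affirmative_bias(negation, embedded):
    """Assume only direct (matrix) n't-questions carry the bias implicature.

    negation is "n't", "not", or None for an affirmative question.
    """
    return negation == "n't" and not embedded


def available_readings(negation, embedded=False):
    """Filter the syntactic readings through the interpretive component."""
    if has_affirmative_bias(negation, embedded):
        # The bias cannot be computed over an alternative question,
        # so that reading is removed rather than the implicature.
        return SYNTACTIC_READINGS - {"alternative"}
    return SYNTACTIC_READINGS


direct_nt = available_readings("n't")                    # as in (6)
direct_not = available_readings("not")                   # as in (7)
indirect_nt = available_readings("n't", embedded=True)   # as in (25a)
```

Here direct_nt contains only the yes-no reading, while direct_not and indirect_nt retain both readings, matching the pattern in (6), (7) and (25).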

6 Conclusion 

In this paper, I have made a novel observation about negative yes-no questions
in English: namely, the alternative question reading is available for
not-questions but not for n’t-questions. I have argued that the interpretive
asymmetry attested between n’t-questions and not-questions cannot be accounted
for in syntax. Instead, I have proposed that the syntax makes available both
the yes-no question and the alternative question readings for n’t-questions as
well as not-questions, but the alternative question reading is ruled out for
n’t-questions due to the incompatibility between the interpretations contributed
by n’t-questions and alternative questions. That is, n’t-questions are associated
with the conventional implicature that the speaker expects the answer to be in the
affirmative and this implicature is not compatible with alternative questions.
Although the question remains as to why n’t-questions are associated with
this implicature, if the conclusions reached in this paper are correct, the
interpretive asymmetry between n’t-questions and not-questions is another case that
has implications for the close interaction between structure and interpretation in
the grammar.
A Categorial Syntax for Verbs of Perception

Robin Clark and Gerhard Jäger



1 Introduction 

“Categorial Grammar” is not a particular grammar formalism, let alone a the- 
ory of grammar, but rather a cover term for a family of quite diverse ap- 
proaches to natural language syntax. This cover term is nonetheless useful, 
since all these theories share important characteristics. Besides the common 
foundation in the works of Ajdukiewicz (1935) and Bar-Hillel (1953) and the 
use of complex syntactic categories built up from atoms with slashes, they are 
based on two related premises that distinguish them from all other theories of 
grammar: 

1. The locus of grammatical generalizations is the lexicon. 

2. Constituent structure plays no role in grammatical theory. 

This does not entail that Categorial Grammars deny the existence of con- 
stituent structure (in fact, Bar-Hillel’s Basic Categorial Grammar and Lam- 
bek’s (1961) non-associative grammar calculus assume a rigid binary branch- 
ing structure). However, all Categorial Grammars assume that constituent 
structure cannot enter grammatical description. 

The hypothesis that constituent structure is immaterial to grammatical
descriptions contrasts sharply with the perspective found in the generative
tradition. Generative grammar is largely grounded on relations like c-command,
m-command and government which are based on tree geometry. Some
theories have defined grammatical relations like subject and object entirely in
terms of constituent structure; for example, the subject of a category X is that
nominal which occurs in Spec(X).

Since such a strategy is not viable in Categorial Grammar (CG henceforth),
researchers working in this tradition usually do without notions like
“subject” etc. The bundle of properties that are associated with subjects are
considered to be logically independent. So it seems that CG misses an important
generalization.

The paper tries to counter this objection by demonstrating that the con- 
figurational notion of “subject” in fact leads to analyses that are descriptively 
inadequate. This point will be made by a case study of naked infinitive (hence- 
forth: NI) perception reports as in: 

U. Penn Working Papers in Linguistics, Volume 6.3, 2000 






(1) Jackie saw Oswald shoot Kennedy. 

Several tests indicate that the accusative NP Oswald is the subject of the 
embedded VP shoot Kennedy. Under a configurational notion of “subject” this 
implies that the string Oswald shoot Kennedy forms a sentential constituent. 
On the other hand, there is firm evidence both from syntax and semantics that 
this string should not be considered a constituent. We will attempt to show that
(a) in a categorial setting, some of the subject properties of the accusative NP
in NI perception reports can be derived without recourse to constituent
structure, and (b) this clears the way for a fairly simple semantics of perception
verbs that solves most puzzles from the literature in a straightforward way.

2 Subject Properties and NI Perception Reports 

Compare the following two sentences: 

(2) a. John saw that Bill left.
b. John saw Bill leave. 

Examples (2a) and (2b) both involve John’s perception of something, although 
their entailments are rather different. Example (2a), epistemic perception, en- 
tails that John has perceived that Bill left and has understood that Bill left. In 
other words, John has understood the content of his perceptions; the following 
should be anomalous: 

(3) John saw that Bill left but he didn’t know it. 

Notice that example (2a) does not entail that John actually saw the event of 
Bill’s leaving. He could, in fact, have drawn the inference that Bill left through 
a fairly complex chain of deductions. 

Example (2b) is quite different. In this case, John must actually have 
visually perceived the event of Bill’s leaving although he may not have under- 
stood that that was what he saw. As Barwise (1981) observes, the following 
sentence: 

(4) Nixon saw Mrs. Wood erase the tape. 

does not imply that Nixon understood what he was seeing. He may have 
thought that she was engaged in a peculiar calisthenic exercise, for example. 
Barwise argues that the tensed complement to a perception verb, as in (2a),
involves “epistemic perception,” that is, perception with some cogitation, while
the naked infinitive complement, as in (2b), involves non-epistemic perception,
that is, raw perception without any additional non-perceptual cogitation.
Thus, I can see John embezzle money without seeing that John is embezzling 
money simply because I can perceive events in the world without understand- 
ing their import. 

These intuitions have led to the standard analysis of the syntactic proper- 
ties of verbs of perceptual report. Epistemic perception is a relation between 
an individual and a proposition. Assuming that the syntactic category CP cor- 
responds to propositions, then, syntactically, this corresponds to a CP comple- 
ment to the perception verb. What about non-epistemic perception? Barwise 
argued that the proper syntactic analysis was along the lines shown in (5): 

(5) [VP see [XP John [V" run]]]

where XP is some category distinct from CP. The representation in (5) is
meant as a cover for a set of analyses that take the immediately post-verbal NP
as forming a constituent with the naked infinitive. A direct consequence of this
analysis is that the denotation of XP must be distinct from CP since (2a) and 
(2b) are not synonymous. Since CPs denote propositions, XP, whatever its 
category, must denote something other than a proposition; Barwise argues that 
XP must denote a scene, a visually perceived situation. This, in turn, lends 
support to his thesis that situations are a basic semantic category. 

Syntactic facts prima facie support this syntactic analysis. To start with, 
non-thematic elements can occur in the postverbal position (as pointed out by 
Gee 1977:468): 

(6) a. John saw it rain. 

b. ?I’ve never seen there be so many complaints from students before.¹

c. John saw the shit hit the fan. 

Example (6a) shows that weather it can occur in the postverbal position. In 
(6b), presentational there occurs in this position and in (6c) an idiom chunk 
can occur in this position with its idiomatic interpretation. Note that weather 
it, presentational there and idiom chunks have the property that they are, in 
some sense, non-referential. 

Consider, first, examples (6a) and (6b). According to classical Govern-
ment-Binding theory, the only way that the postverbal NP could be a direct
object of the verb is if the verb assigns it a thematic role; this is the content
of the θ-Criterion. By “direct object” we mean, of course, that the NP is a
sister to the verb in the parse tree. But if the position were associated with
a thematic role, then non-referential elements would be excluded from that
position. The only syntactic position that is both non-thematic and associated
with a grammatical function is the structural subject position and, therefore,
non-referential elements are restricted to this position unless they are part of
an idiom that includes the entire verb phrase as well. Turning to example (6c),
we see that the postverbal NP receives its idiomatic interpretation. Therefore,
it is non-referential and cannot be a sister to the main verb. This again shows
that the postverbal NP in NI complement examples is a structural subject and
not a direct object. The only way to satisfy this condition is if the postverbal
NP forms a constituent with the naked infinitive.

¹This is given as grammatical by Gee (1977).

Thus, NI constructions seem to class with so-called “Exceptional Case
Marking” constructions, shown in (7), and small clause constructions,² shown
in (8), in allowing a non-thematic element to intercede between the verb and
the embedded predicate.

(7) a. John believes Bill to have stolen the car. 

b. John believes it to be raining. 

c. John believes it to be obvious that Bill stole the car. 

d. John believes there to have been a riot in the park. 

e. John believes the shit to have hit the fan. 

(8) a. John considers Bill a genius. 

b. John considers it obvious that Bill stole the car. 

The crucial point here is that the presence of “a-thematic” material is diag-
nostic of the grammatical function subject; the grammaticality of the (b-e)
examples in (7) shows that the postverbal NP is a true subject of the following
predicate and not the structural object of believe. We must, therefore, contrast
the behavior of the postverbal NP in (7) with its behavior in an object control
construction:

(9) a. John persuaded Bill to steal the car. 

b. *John persuaded it to rain. 

c. *John persuaded it to be obvious that Bill stole the car. 

d. *John persuaded there to be a riot in the park.

e. *John persuaded the shit to hit the fan.

²Because their predicates are not verbal, small clause constructions do not show the
same range of non-thematic material in the following position. We take this as largely
tangential to our main point.

The contrast between the ECM constructions in (7) and the control construc-
tions in (9) presents CG with an interesting problem. Generative grammar ac-
counts for the contrast by associating subject properties with a particular piece
of tree geometry, where by subject property we mean things like:

(10) a. The subject is allowed to be non-thematic; 

b. The subject is the “target” (or “landing site”) of raising operations; 

c. The subject is a “licensed” controller; 

d. The subject is a “trigger” for certain agreement relations; 

e. The presence of a subject defines local domains for binding. 

f. Subjects are islands to extraction. 

The list in (10) can, of course, be expanded and clarified. Our point is that 
in classical generative accounts all of the properties in (10) are unified under 
a particular geometric approach to grammatical relations; thus, establishing 
one of the properties in (10) is sufficient to establish constituent structure and
endow the element in question with the full array of subject properties.

Grammatical subjects are traditionally treated as possible landing sites 
for raising processes like subject-to-subject raising (SSR) and passive. NI 
constructions admit get passives but be passives are more marked: 

(11) a. John saw Bill get examined by a doctor.

b. *?John saw Bill be examined by a doctor. 

Furthermore, the post-verbal position in NI constructions admits only a few 
cases of SSR: 

(12) a. John saw Bill appear to unlock the safe. 

b. ?John saw Bill seem to escape from the handcuffs. 

c. *John saw Bill be likely to drink too much. 

d. *John saw Bill tend to drive on the wrong side of the road. 

The unacceptability of examples (12c) and (12d) is easily accounted for on
the basis of semantic properties of the embedded predicate; it is difficult to
imagine the exact visual manifestations of being likely to drink too much and
tending to drive on the wrong side of the road, both properties being
propensities that should be treated modally. Appearing and seeming, on the other hand,
can involve deliberate deceptions that can be visually realized; stage
magicians make this their stock in trade. Because of this intentionality, appear and,
to a lesser extent, seem may involve semantic relations between the “raised” 
subject and the predicate that are unavailable in the true raising constructions 
associated with likely and tend. Similarly, get passives may be preferred over 
be passives in NI constructions because of secondary semantic properties as- 
sociated with the former but unavailable in the latter; compare, for example, 
the contrast between get and be in certain imperative constructions: 

(13) a. Don’t get killed. 

b. *Don’t be killed. 

The contrast in (13) is probably attributable to differences in the aspectual
properties associated with get and be. These differences may also account for
the contrast between (11a) and (11b). In particular, be passives tend to have a
more stative flavor than get passives, a fact which may limit their distribution
in NI constructions. As was the case for the distribution of pleonastics, then,
raising and passive provide only equivocal support for the subject status of the
post-verbal NP in NI constructions; while the post-verbal NP does show some
subject properties, other factors associated with the semantics of perceptual
reports intervene.

A further subject property involves the distribution of anaphors and here 
the facts are much more straightforward. Putting aside formal details, let us 
suppose, following classical Government-Binding Theory, that subjects create 
a minimal domain for binding; that is, the presence of a structural subject on 
a constituent guarantees that a syntactic anaphor like himself or each other 
must be bound within that constituent while pronominals like her or them 
must be unbound in the same domain. It follows that if the postverbal NP in NI 
constructions is a structural subject, then it should create a minimal domain for 
binding. The following data are consistent with this view of binding domains: 

(14) a. John saw Mary scratch herself. 

b. *John saw Mary scratch himself. 

c. John saw Mary scratch him. 

d. *John saw Mary scratch her. 

Examples (14a) and (14b) show that anaphors like herself must indeed find 
their antecedent within the domain defined by the postverbal NP; Mary in 
example (14a) is proximate to the anaphor herself and, so, is a legal antecedent 
for it; John in example (14b) is too distant to serve as a legal antecedent for 
himself since the NP Mary inscribes an opaque domain for binding due to
its status as a subject. Similarly, examples (14c) and (14d) show that John 
can be a possible antecedent for the pronoun him because Mary defines the 
minimal domain within which the pronoun must be free. Equally, the pronoun 
her in (14d) cannot be coreferential with Mary because the latter is within the 
pronoun’s minimal domain; it cannot be coreferential with John because they 
disagree in gender. 

Finally, we note that subjects tend to be islands to extraction: 

(15) a. *who_i did friends of t_i visit Bill?

b. *which saint_i does Fred consider stories about t_i utter fabrications?

The ungrammaticality of the examples in (15) can be attributed to the fact that
the wh-element is associated with a gap inside a subject, the subject of a tensed
clause in (15a) and of a small clause in (15b). The post-verbal NP in NI
constructions is likewise an island for extraction:

(16) *who_i did John see a friend of t_i steal a car?

The analysis of the post-verbal NP in NI constructions as a true subject can 
immediately treat (16) as a violation of the islandhood of subjects. 

As we have seen, the accusative NP in NI perception reports shows prop-
erties that are traditionally taken to be indicative of subjects. Under the con-
figurational definition of “subject”, one is thus forced to treat the string
[NP_acc VP_inf] as a constituent.

In the next section we will collect a series of syntactic arguments that 
challenge this conclusion. 

3 Other Syntactic Tests for Constituency 

Observations concerning coordination and anaphora also point towards a one- 
constituent analysis. This can be seen from examples (17a) and (17b). 

(17) a. John saw Mary enter and Bill leave.

b. John saw Mary enter, and Bill saw it too. 

Akmajian (1977) points out, though, that virtually all tests for constituency
apart from coordination and pronominalization indicate that the complements of
NI perception verbs do not form a constituent. Thus they cannot appear in the
postcopular position of pseudoclefts:
(18) *What we saw was Raquel Welch take a bath.

Neither can they be inserted into clefted positions: 

(19) *It was Raquel Welch take a bath that we saw. 

They cannot be right node raised: 

(20) *?We could hear, but we couldn’t see, Raquel Welch take a bath. 

Finally, they cannot undergo object deletion: 

(21) *Raquel Welch take a bath is a breathtaking sight to see. 

Additional evidence against a one-constituent analysis comes from top- 
icalization in German. The underlying sentence structure in German is verb 
final. Main clauses display V-2, i.e., one constituent is obligatorily fronted, 
and the finite verb is placed immediately after this constituent. Thus if a string 
is a constituent, we expect that it can be topicalized. Let us apply this test to 
NI perception reports. The underlying word order can be seen in an embedded 
clause like (22): 

(22) weil der Polizist jemanden fliehen gesehen hat. 

since the policeman[nom] somebody[acc] escape[inf] seen[part] has. 
‘since the policeman saw somebody escape.’ 

The topicalization test indicates that NP_acc + VP do not form a
constituent, while the sequence “embedded VP + matrix verb” does:

(23) a. ???Jemanden fliehen hat der Polizist gesehen.

Somebody escape has the policeman seen.

b. Fliehen gesehen hat der Polizist jemanden.

Escape seen has the policeman somebody.

So the appropriate bracketing for (22) should be (24a) rather than (24b): 

(24) a. weil der Polizist [jemanden [fliehen gesehen hat]]
b. weil der Polizist [jemanden fliehen] gesehen hat. 

All these observations indicate that the appropriate syntactic structure for
NI perception reports should be like the trees in (25) for English and Ger-
man respectively, neither of which involves an embedded small clause, rather
than the structure in (5).
4 Vlach’s Puzzle

All detailed studies of the semantics of perception verbs that we are aware of
start with a one-constituent analysis (see for instance Barwise 1981,
Higginbotham 1983, Vlach 1983, van der Does 1991). And even though the
ontological background differs considerably, they agree on the following:

1. Perception is a relation between an agent and an abstract object
(scene/situation, event, partial model etc.).

2. The [NP_acc VP] constituent in NI perception reports denotes a set of sit-
uations (events ...).

3. John sees NP VP can be paraphrased as John sees a situation (event ...) s,
and s ∈ ‖[NP VP]‖.

Let us suppose, as seems reasonable, that active and passive sentences
are supported by the same set of scenes and, so, denote the same proposi-
tion.³ In particular, (26a) and (26b) are true paraphrases, differing only in
their pragmatic contributions, and (26c) differs from the other two only in
the contribution of get to the interpretation of the sentence:

(26) a. Oswald assassinated Kennedy. 

b. Kennedy was assassinated by Oswald. 

c. Kennedy got assassinated by Oswald. 

Furthermore, let us follow the standard assumption that the semantic contribu- 
tion of a passive sentence in an embedded context is exactly comparable, up 
to pragmatics, to the semantic contribution of an active sentence in the same 
context; thus, (27a) is a paraphrase of (27b): 

(27) a. John saw that Oswald assassinated Kennedy.

b. John saw that Kennedy was assassinated by Oswald.

³Vlach likely wouldn’t agree with this. His view is discussed below.

Notice, in particular, that no privileged relationship holds between John and
either Oswald or Kennedy in either (27a) or (27b).

Compare this situation in (27) with the pair of sentences in (28), first 
observed by Vlach (1983): 

(28) a. John saw Oswald shoot Kennedy. 

b. John saw Kennedy get shot by Oswald. 

The behavior of the sentences in (28) is peculiar given the small clause analysis 
of NI complements, since (28a) and (28b) are not paraphrases of each other 
and differ by more than the contribution of get to (28b). In particular, for (28a) 
to be true it must be the case that John saw Oswald exactly when the latter shot
Kennedy; John need not have seen Kennedy at all. For (28b) to be true, on the 
other hand, John must have seen Kennedy at the moment that he got shot by 
Oswald; he need not have seen Oswald at all. Thus, while (28a) is not true of 
anyone, many people have shared John’s visual experience in (28b). 

In brief, it would seem that the subject of the perception verb and the 
postverbal NP stand in some special relationship in non-epistemic perceptual 
reports, a relationship that is wholly absent in epistemic perceptual reports. To 
be more precise, we claim that the inference pattern in (29a) is valid, but the 
one in (29b) isn’t. 

(29) a. x saw y VP ⇒ x saw y

b. x saw y [VP V z] ⇒ x saw z

The invalidity of (29b) is demonstrated by (28). To substantiate the claim 
that (29a) is valid, let us consider three putative counterexamples. Suppose, 
first, that John is standing behind an opaque plastic screen, using magnets to 
move metal puppets on the other side of the screen. Suppose Mary observes 
the movement of the puppets, without seeing John. Can Mary use (30) to 
report her perception? 

(30) I saw John move the puppets. 

Gee (1977) claims that she can. The judgment of those we have asked is that
although (30) is marginal in this context, it can be so used just in case Mary 
is absolutely certain that no one else could be responsible for the movement 
of the puppets. In this case, seeing puppet movement is tantamount to direct 
perception of John. 
Similarly, consider the case where Mary is separated from a forest by a
large hill, so that she cannot see the forest (this example is also due to Gee
1977). Observing a huge billow of smoke rising over the hill, can Mary later
use (31) to report her experience?

(31) I saw the forest burn.

Again, the consensus of those we have asked is that (31) is odd in the above 
context. We can sharpen the intuition by considering the following sentence: 

(32) I saw the forest burn even though I didn’t see the forest. 

According to our intuitions, this sentence is contradictory, no matter what 
background knowledge we assume. 

The same argumentation applies ceteris paribus to an argument from van 
der Does (1991), who in turn attributes it to Robin Cooper. He raises the 
question whether an entailment relation holds between (33a) and (33b): 

(33) a. Daniel saw Lucia phone Henry.
b. Daniel saw Lucia. 

Van der Does (op. cit., p 245) discusses the following scenario: “Imag- 
ine Lucia, Henry and Daniel each sitting in separate rooms. There are phones 
which enable Lucia and Henry to speak to each other, but only when Lucia 
phones Henry an oscilloscope in Daniel’s room will show a patterns charac- 
teristic for Lucia’s voice. Now suppose Daniel saw the patterns, can one report 
the fact by saying [(33a)]?” Van der Does claims that at least some subjects
answered affirmatively, since “perceiving the pattern on the oscilloscope is 
perceiving a representation of Lucia, much as perceiving a video-recording of 
her would have been. And clearly in the latter sense [(33a)] might be used.” 
We agree that (33a) might be true in such a situation, but so would (33b), and 
for the very same reason. In other words there might be some vagueness as to 
how direct direct perception should be, but the inference pattern is not affected 
by that. 

To sum up the discussion so far, we observe a special semantic relation-
ship between the matrix subject and the accusative NP in NI perception re-
ports. In other words, NP_acc and VP don’t form a semantic unit. This indicates
that they don’t form a syntactic unit either.

It should be mentioned that Vlach (1983), who was presumably the first 
to notice this special relationship, nevertheless uses a one-constituent analysis. 
According to him, the difference in meaning between (28a) and (28b) is due
to the fact that the denotation of Kennedy get shot by Oswald consists of events
that include Kennedy’s location, while Oswald shoot Kennedy denotes a set of
events that are locally connected to Oswald. Vlach doesn’t give an explana-
tion for this asymmetry, but apparently he assumes that event descriptions that
include a subject denote events that are located at or around the location of the
referent of the subject.

To test this assumption, consider (34). 

(34) Jackie saw Oswald’s assassination of Kennedy. 

Despite the fact that the subject of the event description is Oswald and
Jackie didn’t see Oswald, the sentence is true. So the location of an event that
is described by an event noun is not determined by the location of the referent
of the subject. Events that are described by tensed sentences do not confirm a
special status of the subject either.

(35) a. Oswald shot Kennedy. 

b. ?That happened in the Texas Book Depository. 

c. ?That happened in the Presidential Limousine. 

d. That happened in Dallas. 

It appears that an event that is described by a tensed clause has to include 
all participants, not just the referent of the subject. So we may conclude that 
our argumentation above is supported. The accusative NP and the embedded 
VP shouldn’t be considered to be a semantic unit. 

What about the arguments in favor of a one-constituent analysis? There 
were two that didn’t rely on grammatical functions, coordination and anaphora. 
In the next section, we will demonstrate that the former argument is not con- 
clusive; non-constituents may be conjoined. Anaphora isn’t a conclusive ar- 
gument either. It is generally held nowadays that anaphora resolution operates 
on semantic entities rather than on syntactic constituents. We will argue below
that the meaning of (17b) does involve the event of Mary’s entering, even though
it does not correspond to any constituent. So we expect anaphoric reference to 
it to be possible. 

5 The Semantics of Verbs of Perception 

As our starting point we take the semantics of verbs of perception as proposed
by Higginbotham (1983). This decision is of little significance; other propos-
als like Barwise’s or van der Does’ could be modified in a similar fashion.
Higginbotham assumes that verbs that can occur in the complement of NI per-
ception reports have an event argument, and that the logical form of a sentence
like (36a) is (36b).

(36) a. John saw Mary leave. 

b. ∃e(LEAVE(m, e) ∧ John sees e)

The variable e ranges over events here. The verb see that occurs with NI 
complements is thus semantically reduced to simple transitive see. As ar- 
gued above, this semantics cannot be correct since then seeing Oswald shoot 
Kennedy would come down to seeing the whole event of Oswald’s assassina- 
tion of Kennedy, which in turn entails seeing Kennedy. The truth conditions 
of 

(37) Jackie saw Oswald shoot Kennedy. 

are much weaker. To establish its truth, it is sufficient that Jackie saw that
part of the complex assassination event that directly involved Oswald, i.e. his
aiming and pulling the trigger. To accommodate this intuition, let us assume
that for each participant x of an event e, there is a unique subevent e_x of e that
has x as its only participant. We won’t spell out this operation formally here,
but the intuition should be clear enough. So the logical form of (37) should be

(38) ∃e(SHOOT(LHO, JFK, e) ∧ Jackie sees e_LHO)

It goes without saying that such a logical form can only be derived composi- 
tionally if the accusative NP Oswald is an argument of the matrix verb. This 
in mind, we can give the following lexical semantics of see: 

(39) λPxy.∃e(Pxe ∧ SEE(y, e_x))

This is compatible with the following syntactic category of see, the Categorial 
counterpart to the second structure in (25). 

(40) (N\S)/VPI/N

We leave irrelevant morphosyntactic details open. In particular, we do not spell
out the internal structure of NI VPs but abbreviate their category as VPI.
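
As an illustration only, the difference between the whole-event logical form in (36b) and the subevent-based meaning in (39) can be checked in a toy model. The Python sketch below is not part of the proposal itself: the names Event, sub, SEEN, sees_whole, see_ni_original and see_ni_revised are hypothetical, the subevent e_x is simply encoded as a pair of an event and a participant, and "seeing the whole event" is assumed to mean seeing every participant's part of it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An event with a predicate name and an ordered tuple of participants."""
    pred: str
    participants: tuple  # e.g. (agent, patient)


def sub(event, x):
    """Stand-in for the subevent e_x: the part of `event` that involves only x."""
    if x not in event.participants:
        raise ValueError(f"{x} does not participate in {event}")
    return (event, x)


# Toy model (an assumption for illustration): one shooting event,
# and Jackie saw only Kennedy's part of it.
shooting = Event("SHOOT", ("LHO", "JFK"))
EVENTS = [shooting]
SEEN = {("Jackie", sub(shooting, "JFK"))}


def holds(pred, args, e):
    """True iff event e is a `pred` event with exactly these participants."""
    return e.pred == pred and e.participants == tuple(args)


def sees_whole(agent, e):
    """Assumed reading of 'agent sees e': agent sees every participant's part."""
    return all((agent, sub(e, x)) in SEEN for x in e.participants)


def see_ni_original(agent, pred, args):
    """(36b)-style: there is an e with P(args, e) and agent sees e as a whole."""
    return any(holds(pred, args, e) and sees_whole(agent, e) for e in EVENTS)


def see_ni_revised(agent, acc_np, pred, args):
    """(39)-style: there is an e with P(args, e) and SEE(agent, e_x),
    where x is the referent of the accusative NP."""
    return any(holds(pred, args, e) and (agent, sub(e, acc_np)) in SEEN
               for e in EVENTS)


# 'Jackie saw Oswald shoot Kennedy' vs. 'Jackie saw Kennedy get shot by Oswald'
active = see_ni_revised("Jackie", "LHO", "SHOOT", ("LHO", "JFK"))
passive = see_ni_revised("Jackie", "JFK", "SHOOT", ("LHO", "JFK"))
whole = see_ni_original("Jackie", "SHOOT", ("LHO", "JFK"))
```

In this model the revised meaning separates the active report from the passive one, because only the accusative NP's part of the event has to be seen, whereas the whole-event rendering assigns a single truth value to both.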

In the sequel we will show that this semantics of verbs of perception, 
paired with a categorial syntax, meets the main criteria that are discussed in 
the literature. 
Veridicality This is Barwise’s name for the inference scheme

(41) John saw Mary leave ⊨ Mary left.

In an event-based semantics, the logical form of Mary left is ∃e : LEAVE(m, e).
As under Higginbotham’s original account, this follows from the premise by
simple first-order reasoning.
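
The step can be written out as follows. This is a sketch that assumes the lexical meaning in (39), with j and m as constants for John and Mary and e_m for the subevent of e involving only Mary; it requires the amsmath package.

```latex
\begin{align*}
\text{John saw Mary leave:}\quad
  & \exists e\,\bigl(\mathrm{LEAVE}(m,e) \wedge \mathrm{SEE}(j, e_m)\bigr) \\
\text{Mary left:}\quad
  & \exists e\,\mathrm{LEAVE}(m,e) \\
\text{since}\quad
  & \exists e\,(\varphi \wedge \psi) \models \exists e\,\varphi .
\end{align*}
```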

Extensionality All elements of an NI perceptual report are transparent, i.e., 
they can be replaced by extensionally equivalent expressions salva veritate. 
Since no intensional operators are involved in our semantics of see, this is 
predicted. 

Absence of scope ambiguities Generally, all scope inducing items that might 
occur in one of the complements of see have matrix scope. Under our ap- 
proach, this has nothing to do with the semantics of see but follows from its 
syntax. To start with, quantifiers in the accusative position always have matrix 
scope, for instance: 

(42) John saw Q leave = Qx are such that John saw x leave. 

Since the quantifier occupies an argument position of the matrix verb, it must 
at least take scope over the matrix VP, no matter what particular approach to 
quantifier scope we adopt. 

Coordination behaves similarly, i.e., the following two equivalences hold. 

(43) a. John saw Mary swim and Bill walk = John saw Mary swim and John
saw Bill walk.

b. John saw Mary swim or Bill walk = John saw Mary swim or John
saw Bill walk.

In any version of CG, conjunctions are considered polymorphic items.
Their category is X\X/X, where X ranges over Boolean categories.⁴ Roughly,
a category is Boolean iff the corresponding semantic type ends up in t. So
S, N\S, CN etc. are Boolean categories. The meaning of the coordination
and is λQPx.Px ∧ Qx. So in the first sentence in (43a), the substrings Mary
swim and Bill walk have to be assigned a Boolean category each to make them
conjoinable.

⁴Steedman’s (1996) syncategorematic treatment of conjunctions amounts to the
same thing.





A CATEGORIAL SYNTAX FOR VERBS OF PERCEPTION



Under any version of CG, the accusative NP and the NI phrase cannot be 
combined directly to yield a Boolean category. Thus as in the case of quantifier 
scope, the absence of a narrow scope reading is expected. We have to answer 
the question how the wide scope reading is to be derived though. 

Up to the present point, we remained neutral as to which version of CG 
is to be used. To handle this puzzle, we have to be more specific. Since 
Mary and swim do not form a functor-argument structure here, we need a 
certain degree of associativity to deal with this instance of non-constituent 
coordination. So the example can be handled in any version of Combinatory 
Categorial Grammar (CCG, cf. Ades and Steedman 1982) that contains the 
operation of function composition, and in any descendant of Lambek’s (1958) 
associative CG. 

As shown in Fig. 5,⁵ the reading in question can be derived in CCG using
only type lifting and backward function composition. We abbreviate N\S as
VP for convenience. The predicate SEEi is shorthand for the meaning of see
(cf. (39)). Since both combinators are theorems of the Lambek calculus, this 
is simultaneously a Lambek derivation. 
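
The semantic side of such a derivation can be illustrated with a small Python sketch. It does not reproduce the figure; it only shows how type lifting, backward composition and generalized conjunction combine. Several things are assumptions made for the illustration: the helper names (lift, compose_backward, conj, vp, see, cluster), the toy model in EVENTS and SEEN, and the decision to curry see in the order suggested by the category in (40) (accusative NP first, then the NI VP, then the subject), which differs from the argument order written in (39).

```python
def lift(a):
    """Type lifting: a  =>  lambda f. f(a)."""
    return lambda f: f(a)


def compose_backward(f, g):
    """Backward function composition: apply f first, then g."""
    return lambda z: g(f(z))


def conj(q, p):
    """Generalized 'and': pointwise conjunction for any type ending in a truth value."""
    if isinstance(p, bool):
        return p and q
    return lambda z: conj(q(z), p(z))


# Toy model (illustrative assumption): events as (id, predicate, participant),
# and the participant-parts of events that an agent has seen.
EVENTS = [("e1", "swim", "mary"), ("e2", "walk", "bill")]
SEEN = {("john", "e1", "mary")}  # John saw Mary's part of e1 only


def vp(pred):
    """Naked-infinitive VP meaning: lambda x. lambda e. pred(x, e)."""
    return lambda x: lambda e: (e, pred, x) in EVENTS


def see(x):
    """Meaning of 'see' in the spirit of (39), curried as in (40):
    there is an e with P(x)(e) such that y sees x's part of e."""
    return lambda P: lambda y: any(
        P(x)(e) and (y, e, x) in SEEN for (e, _, _) in EVENTS)


def cluster(np, verb_phrase):
    """'Mary swim' as a function awaiting the verb: lambda v. v(np)(verb_phrase)."""
    return compose_backward(lift(np), lift(verb_phrase))


mary_swim = cluster("mary", vp("swim"))
bill_walk = cluster("bill", vp("walk"))
both = conj(mary_swim, bill_walk)  # 'Mary swim and Bill walk'

# 'John saw Mary swim and Bill walk'
wide_scope = both(see)("john")
# 'John saw Mary swim and John saw Bill walk'
spelled_out = see("mary")(vp("swim"))("john") and see("bill")(vp("walk"))("john")
```

Because the conjunction applies pointwise to two functions that are both still waiting for the verb, the verb meaning is distributed over the conjuncts, which is the wide scope equivalence in (43a).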

Failure of logical equivalence Although the complements of perception 
verbs can be combined by the classical propositional connectives, comple- 
ments that are equivalent in classical logic cannot always be exchanged salva 
veritate. For instance, (44b) doesn’t have a reading that is equivalent to (44a). 

(44) a. Hegel saw Schelling sneeze. 

b. Hegel saw ((Schelling sneeze and Holderlin eat) or (Schelling sneeze 
and Holderlin not eat)). 

To handle this problem, it has to be remarked that even though the use
of propositional connectives in the “complement” of perception verbs looks
suggestive, we assume a different treatment of conjunction and disjunction on
the one hand, and of negation on the other hand. The former connectives al- 
ways receive a wide scope interpretation, while negation is predicate negation. 
Therefore we do not expect patterns that look like classical validities to be 
sustained. Under this treatment (44b) is synonymous with (45). 



⁵The Combinatory branch of CG uses a format for complex categories and for
derivation trees that differs somewhat from the Lambek tradition. We chose a com-
promise here in using the backslash in Lambek’s sense (“A \ B” takes an argument of
type A and yields a value of type B) while choosing a CCG-style derivation tree.
(45) (Hegel saw Schelling sneeze and Hegel saw Holderlin eat) or (Hegel saw
Schelling sneeze and Hegel saw Holderlin not eat).

By simple propositional reasoning we can infer Hegel saw Schelling sneeze
from this. But furthermore we infer that there is an event e that involves 
Holderlin and that is seen by Hegel. This does not follow from (44a), so (44a) 
and (44b) cannot be equivalent. 
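
Schematically, and only as a sketch: assume the meaning in (39), the constants h, s and o for Hegel, Schelling and Holderlin, and an overlined predicate as our own notation for predicate negation. The reasoning then runs as follows (the block requires the amsmath package).

```latex
\begin{align*}
A &:= \exists e\,\bigl(\mathrm{SNEEZE}(s,e) \wedge \mathrm{SEE}(h, e_s)\bigr)
   && \text{(44a)} \\
B &:= \exists e\,\bigl(\mathrm{EAT}(o,e) \wedge \mathrm{SEE}(h, e_o)\bigr) \\
C &:= \exists e\,\bigl(\overline{\mathrm{EAT}}(o,e) \wedge \mathrm{SEE}(h, e_o)\bigr) \\
(45) &= (A \wedge B) \vee (A \wedge C) \;\equiv\; A \wedge (B \vee C) \\
(45) &\models A, \qquad\text{but}\qquad A \not\models B \vee C ,
\end{align*}
```

because B ∨ C requires that Hegel saw some subevent involving Holderlin, which A alone does not guarantee.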

The puzzle of Russell’s schoolchildren Barwise gives a further desidera-
tum for an adequate semantics of perception reports which is illustrated by the
following inference scheme.

(46) a. Russell sees each boy touch at least one girl. 

b. Russell didn’t see any girl being touched by more than one boy. 

c. ⊨ There are at least as many girls as boys.

As Vlach correctly observes, this inference scheme is not valid. Imagine a 
gameshow where 10 boys have to find a partner among 5 girls. The participants 
cannot see each other, but everybody sits in a booth with several phones each 
of which connects to exactly one participant of the opposite sex (so the boys 
have 5 phones and the girls 10 phones each). Each boy calls one girl, and it 
happens that each girl receives exactly two phone calls. So each girl picks up 
two receivers simultaneously and holds them to her ears (one receiver per ear). 
The TV audience can see all 15 participants, but they can only see the left side 
of the girls. Russell was watching this silly show on TV. In this situation (a) 
and (b) are true, but (c) isn’t: 

(47) a. Russell saw each boy calling at least one girl. 

b. Russell didn’t see any girl being called by more than one boy. 

c. There are at least as many girls as boys. 

This is problematic for the theories of Barwise, Higginbotham, and van
der Does, since they uniformly predict that if Russell sees a calling b, he also
sees b getting a call from a. So if Russell saw two boys calling the same
girl, he would see this girl getting a call from two boys. This makes the argument
valid. Since we don’t claim that Russell sees a girl called if he sees a boy
call her, no such prediction is made.

To sum up this section, we tried to demonstrate that a fairly innocent mod- 
ification of Higginbotham’s proposal is sufficient to accommodate Vlach’s 
puzzle while preserving its general advantages. Likely a similar adjustment 
could be made with other theories of the semantics of perceptual reports.
If we insist on compositional interpretation, this adjustment excludes a one- 
constituent analysis of the syntax of perceptual reports though. This in turn 
forces us to adopt a syntactic theory that is able to handle non-constituent co- 
ordination, as most versions of CG do. 



6 Conclusion 

In this paper, we have presented some arguments for a reanalysis of NI com- 
plements to verbs of perceptual report. We have argued that the semantic 
analysis that takes the NI complement as a constituent denoting a scene or 
situation fails to provide a satisfying account of certain entailments; these se- 
mantic properties follow directly on our account, which does not treat the NI 
complements as a single semantic unit. 

Furthermore, our account handles many of the syntactic properties asso- 
ciated with NI constructions; indeed, it seems to fare at least as well as the 
standard small clause account. We are, however, left with a residual problem: 
How can we account for the subjectlike properties of the post-verbal NP? On
the standard account, the subject properties of this NP follow because these 
properties are correlated with tree geometry. We believe, however, that this 
approach to grammatical relations requires an undesirable loosening of the 
relationship between the syntax and the semantics. 

So while subject properties should not be considered as evidence for a 
particular constituent structure, they require an explanation nevertheless. As 
far as the Binding facts are concerned, this might be fairly straightforward 
if we assume that Binding means linking of the anaphor to a superordinate
argument place of the local verb (see for instance the proposals of Szabolcsi
1988 or Hepple 1990). Under this perspective, the domain of Binding is the
local VP rather than the local clause. The other subject properties discussed 
above have to be left as an open problem, however. 

One virtue of CGs is that they maintain a homomorphic relationship be- 
tween syntax and semantic structures. While CGs have a pleasingly axiomatic 
structure that clarifies the relationship between natural language syntax and 
logic, they provide no obvious account of grammatical relations. We believe
that one task for the grammarian is to elucidate the role that grammatical relations
play both in syntax and in semantics. We have not, however, given such
a theory in this draft, contenting ourselves with posing the problem as clearly 
as we could. 
Defective Complements in Tree Adjoining Grammar*

Seth Kulick, Robert Frank and K. Vijayshanker 

1 Introduction 

Syntactic theory has long made use of the idea that clausal complements can be
different sizes. For example, while the finite complement to believes in (1a)
projects up to CP, the nonfinite complement in (1b) projects only to IP. The
most obvious reason for this approach is, of course, that the finite complement
can have a complementizer, while the nonfinite one cannot. This is, in turn,
related to accounts of how Case can be assigned to the complement subject
when the complement is IP but not CP.¹ Another example of how smaller
complements are used in syntactic theory is the case of subject-to-subject
raising, as in (2a). Such raising is only possible when the complement
is an IP, but not a CP (2b). Restrictions on movement are therefore correlated
with the size of the complement.

(1) a. John believes [CP that [IP Bill is a freak]]

b. John believes [IP Bill to be a freak]

(2) a. Bill_i seems [IP t_i to be a freak]

b. * Bill_i seems [CP that t_i is a freak]

This use of differing complement sizes has been extended to handle fur- 
ther types of inter-clausal movement, by increasing the options for the size of 
the complement. One particular case in which this approach has been taken is 
that of ‘clitic climbing’ in Romance, in which a clitic can sometimes appear 
in a higher clause than the clause with which it is semantically associated.
Going back at least to Strozer (1977), various linguists have suggested that 
clitic climbing takes place when the complement is ‘defective’, even more so 
(that is, smaller) than for the complements of raising or ECM verbs, although 
the exact size of the complement has changed depending on the analysis and 
the options available within syntactic theory. 

*We gratefully acknowledge the financial support of NSF grants SBR-898-20239,
SBR-89-20230, and SBR-97-10411, respectively, for the three authors.

¹There are different stories about how such exceptional Case marking takes place:
either by governing across IP, or movement of the complement subject to [Spec, AgrOP]
in the higher clause, etc. These details do not matter here, since the main point is the
utility of using complements of different sizes.

U. Penn Working Papers in Linguistics, Volume 6.3, 2000 






The purpose of this paper is two-fold. First, we discuss an analysis of 
clitic-climbing within the framework of Tree Adjoining Grammar (TAG). A 
‘defective complement’ analysis was used in Bleam (1994) to account for clitic 
climbing in TAG. While the analysis is in several respects very successful, we 
point out some important cases that it is unable to handle. Indeed, following
Bleam (1994)’s basic assumptions, it is difficult to give any such analysis
for these cases in Tree Adjoining Grammar. Since we accept those basic
assumptions, we utilize a reconceptualization of Tree Adjoining
Grammar proposed by Frank and Vijay-Shanker (1998) and Frank et al. (1999).

While this approach allows the problems faced by Bleam (1994)’s anal- 
ysis to be handled, it in turn faces certain challenges of prohibiting locality 
violations by clitic movement. Investigating this problem leads to the second 
goal of this paper, which is to show how the same derivational machinery used 
for subject-to-subject raising and wh-movement is also used for clitic climb- 
ing, resulting in a unified analysis of inter-clausal movement in this revised 
TAG framework, while still accounting for their different properties. 

In Section 2 we present the data concerning clitic-climbing in Romance, 
and the TAG framework is introduced in Section 3. Section 4 discusses Bleam 
(1994)’s analysis and problematic cases for the analysis. Section 5 discusses 
the recharacterization of the TAG framework, and Section 6 shows how it can 
be used to solve the problems discussed in Section 4. Section 7 discusses 
the resulting analysis in more detail, showing how locality can be retained and 
how the solution fits into an overall account of inter-clausal movement in TAG, 
and Section 8 presents a short conclusion. 



2 Data: Clitic Climbing in Spanish

We are concerned in this paper with Romance object-clitics, unstressed pronom- 
inal elements associated with the objects of a verb. The object as a full NP 
follows the verb, as in (3). In Spanish and Italian, the clitic precedes a finite 
verb (4), and follows a nonfinite verb (roughly) (5). We focus here on Spanish, 
although the same issues hold for Italian.²

(3) Mari no vio la pelicula 
Mari neg saw the movie 
‘Mari did not see the movie’ 



²In both Spanish and Italian, the object clitics (roughly) appear following a nonfinite
verb, and preceding a finite verb. We abstract away from this issue here. 
(4) Mari no la vió
Mari neg it saw
‘Mari did not see it’

(5) Mari quiere verla
Mari wants to see it



Object clitic placement is usually a clause bound operation, in which the 
clitic appears on the verb with which it is associated (or on an auxiliary verb 
in the same clause). As shown in (6), the clitic does not in this case appear on 
the higher verb, but must appear on the verb it is semantically associated with, 
in this case comer. 

(6) a. Luis insistió en comerlas

Luis insisted on eating them
b. * Luis las insistió en comer

This is the ‘typical’ case. However, with a limited number of verbs, such 
as quiere, in addition to the clitic staying with the lower verb, as in (7a), it 
can also optionally appear on that higher verb, as in (7b). This is commonly 
referred to as ‘clitic climbing’, since the clitic appears to climb to a higher 
clause. We will follow Aissen and Perlmutter (1983) in referring to the verbs
that allow such movement of the lower clitic to them, such as quiere, as the
‘trigger’ verbs.³

(7) a. Luis quiere comerlas 

b. Luis las quiere comer 

‘Luis wants to eat them’ 

The puzzle of sentences such as (7b) is, of course, that the normal locality
constraint on clitic placement, as in (6), seems to be violated. Furthermore,
the clitic can move past a series of verbs, as long as those verbs are all trigger 
verbs, as in (8): 

(8) Juan la quiere poder comprar 
Juan it wants to be able to buy 
‘Juan wants to be able to buy it’ 
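
The locality pattern in (6)-(8) can be stated procedurally. The following Python sketch is only an illustration under stated assumptions: the function name `can_climb_to_top` and the small `TRIGGER_VERBS` set are introduced here for exposition, and the set lists only the verbs that examples (7) and (8) treat as trigger verbs.

```python
from typing import List, Set

# Illustrative assumption: only the trigger verbs used in examples (7) and (8).
TRIGGER_VERBS: Set[str] = {"quiere", "poder"}


def can_climb_to_top(verbs: List[str]) -> bool:
    """Check whether a clitic of the lowest verb may appear on the highest verb.

    `verbs` lists the verbs of the sentence from highest to lowest.
    Climbing is possible only if every verb above the lowest one is a trigger verb.
    """
    higher_verbs = verbs[:-1]
    return len(higher_verbs) > 0 and all(v in TRIGGER_VERBS for v in higher_verbs)


print(can_climb_to_top(["quiere", "comer"]))             # (7b) Luis las quiere comer
print(can_climb_to_top(["quiere", "poder", "comprar"]))  # (8) Juan la quiere poder comprar
print(can_climb_to_top(["insistió", "comer"]))           # (6b) *Luis las insistió en comer
```

The check says nothing about how the clitic reaches the higher verb; it only records the descriptive generalization that a non-trigger verb blocks climbing.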

³Clitic climbing is just one type of unexpectedly long movement allowed by trigger
verbs. These different movements are commonly grouped together under the term
‘restructuring’. Some of the other aspects, such as the ‘long middle-si’, raise some
different issues for TAG, and also interact with clitic-climbing in interesting ways.
However, space prohibits discussion here of these other aspects of restructuring.
Figure 1: Adjoining in TAG



3 Tree Adjoining Grammar

The fundamental idea of TAG (Frank (1992), Kroch and Joshi (1985)) as a 
grammatical formalism is that the specification of grammatical constraints 
can be separated from the recursive processes in the grammar. This is ac- 
complished by localizing the grammatical constraints within small pieces of 
phrase structure, called elementary trees, which are combined using the ad- 
joining operation. 

Adjoining inserts one elementary tree inside the body of another, as shown 
in Figure 1. 

Trees which can be adjoined into another tree are auxiliary trees, and
have a foot node along the frontier which is of the same category as the root
node. Adjoining is what allows recursive structures to be separated from the
specification of the grammar; recursive structures are treated as auxiliary trees,
which adjoin in to produce non-local dependencies.⁴

The working hypothesis for all linguistic work in TAG is that the sub- 
stantive theory of syntax must be stated over the bounded local domains of 
the elementary trees. It is also taken as a basic assumption that all semantic 
arguments associated with a verb are located in the same elementary tree as 
that verb. We follow here the characterization of elementary trees proposed by 
Frank (1992), in which an elementary tree consists of the extended projection, 

⁴TAG also uses tree substitution, which by itself would only give the context-free
power. The use of adjoining pushes TAG into the class of ‘mildly context-sensitive’
grammar formalisms (Joshi et al. 1990). Substitution is commonly used to insert arguments
into a tree, a detail we have abstracted away from here. Substitution also plays a
role in the definition of ‘multi-component’ TAG, as seen later in this section.
Figure 2: Wh-Movement in TAG



in the sense of Grimshaw (1990), of a lexical predicate: 

(9) Condition on Elementary Tree Minimality (CETM): Every elementary
tree consists of the extended projection of a single lexical
head.

One example of the use of adjoining for recursive processes is given by 
the TAG analysis in Frank (1992), Kroch and Joshi (1985) of wh-movement, 
as in What do you think that Bill saw? The moved wh-element and its trace
are localized in a single elementary tree for What_i that Bill saw t_i, as shown in
(A) of Figure 2.

This is an example of how in TAG all movement transformations are lo- 
calized to take place in a single tree. The auxiliary tree for do you think, (B) 
in Figure 2, is a C’ auxiliary tree that adjoins in at the C’ node of (A). This 
produces the desired result in (C), which shows how adjoining accomplishes 
the same result as inter-clausal movement, in this case cyclic A’-movement.⁵

Crucially, there is no ‘movement’ from one clause to another. All movement
is internal to an elementary tree, and the appearance of inter-clausal
movement results from segments of a tree getting stretched away from the rest
of the tree, as illustrated by the what of (A) in Figure 2 being stretched away 
from the rest of (A) by the adjoining of (B). 
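
The stretching effect of adjoining can be made concrete with a small Python sketch. It is a minimal illustration, not an implementation of any existing TAG system: the `Node` class and the `adjoin` and `words` helpers are assumptions introduced here, and the trees are flattened versions of (A) and (B) for What do you think that Bill saw?

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                                   # a category such as "C'" or a word
    children: List["Node"] = field(default_factory=list)
    is_foot: bool = False                        # marks the foot node of an auxiliary tree


def find_foot(tree: Node) -> Optional[Node]:
    """Return the foot node of an auxiliary tree, if it has one."""
    if tree.is_foot:
        return tree
    for child in tree.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(site: Node, aux_root: Node) -> None:
    """Adjoin the auxiliary tree rooted in aux_root at the node `site`, in place."""
    foot = find_foot(aux_root)
    if foot is None or foot.label != aux_root.label or site.label != aux_root.label:
        raise ValueError("root, foot and adjunction site must share a category")
    # The material below the site moves under the foot node ...
    foot.children, foot.is_foot = site.children, False
    # ... and the body of the auxiliary tree takes its place under the site.
    site.children = aux_root.children


def words(tree: Node) -> List[str]:
    """Read the terminal yield of a tree from left to right."""
    if not tree.children:
        return [tree.label]
    return [w for child in tree.children for w in words(child)]


# (A): [CP what [C' that Bill saw t]], simplified to a flat C'
c_bar = Node("C'", [Node("that"), Node("Bill"), Node("saw"), Node("t")])
tree_a = Node("CP", [Node("what"), c_bar])
# (B): [C' do you think C'*], where C'* is the foot node
tree_b = Node("C'", [Node("do"), Node("you"), Node("think"), Node("C'", is_foot=True)])

adjoin(c_bar, tree_b)
print(" ".join(words(tree_a)))  # what do you think that Bill saw t
```

Nothing moves between clauses in this sketch: what and its trace both originate in `tree_a`, and adjoining `tree_b` at C' only increases the distance between them.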

⁵Note that think takes a C’ complement, which allows its tree to
be used as a C’ auxiliary tree. We are adopting here Frank (1992)’s proposal that the
bridge verbs (the ones that allow movement from the complement of a lower clause) 
are the ones that take a C’ complement, as opposed to non-bridge verbs such as regret, 
which take a CP complement and so cannot be used for inter-clausal wh-movement by 
(10) John seems to like pizza

(11) [Tree diagrams: (a) IP, the derived tree for John seems to like pizza; (b) IP, the elementary tree for John to like pizza; (c) I’, the auxiliary tree for seems]

The same basic approach applies for subject-to-subject raising as in (10). 
The auxiliary tree in (11c) is adjoined into (11b) at the I’ node, thereby ‘stretching’
John away from to like pizza, to produce (11a).

The operations of substitution and adjoining allow two elementary trees 
to interact with each other. A natural way to ‘loosen’ the definition of TAG 
is to allow the TAG operations to manipulate multiple trees at a time. These 
extensions are referred to as ‘multi-component’ extensions of TAG, since the 
basic components of the grammar are no longer trees, but tree sets with sev- 
eral components. One such extension, ‘tree-local multi-component TAG’ (TL- 
MCTAG), has been the most used for various problems that arise with basic 
TAG. TL-MCTAG requires that all of the members of a tree set be adjoined 
or substituted into a single elementary tree, as broadly illustrated in Figure 3. 
(A) and (B) in the figure show that two members can either both adjoin into 
another tree, or one component can adjoin while the other substitutes.⁶ What
is not allowed by the definition of TL-MCTAG, though, is the scheme in (C), 
in which a tree adjoins into one component of the tree set, while the other com- 
ponent of the tree set adjoins into that tree. The consequences of this definition 
of TL-MCTAG for clitic climbing are discussed in the next section. 
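
The tree-locality requirement can be sketched as a simple check over derivation steps. The Python below is an illustration only: the representation of a derivation as (inserted tree, host tree) pairs, the function name `is_tree_local`, and the tree names `beta1`, `beta2` and `gamma` are all assumptions made here, not part of any TAG implementation.

```python
from typing import Dict, List, Tuple

# One derivation step: (tree that is adjoined or substituted, tree that hosts it).
Step = Tuple[str, str]


def is_tree_local(tree_set: List[str], steps: List[Step]) -> bool:
    """True if every member of tree_set is inserted into one and the same host tree."""
    hosts: Dict[str, str] = {inserted: host for inserted, host in steps}
    targets = {hosts.get(member) for member in tree_set}
    # A member that is never inserted anywhere shows up as None and breaks locality.
    return len(targets) == 1 and None not in targets


tree_set = ["beta1", "beta2"]

# Schemes (A) and (B): both components go into the single tree gamma.
print(is_tree_local(tree_set, [("beta1", "gamma"), ("beta2", "gamma")]))

# Scheme (C): gamma adjoins into beta1, while beta2 adjoins into gamma.
print(is_tree_local(tree_set, [("gamma", "beta1"), ("beta2", "gamma")]))
```

The check deliberately ignores whether each step is an adjoining or a substitution, since the definition as described here restricts only where the components of a tree set may go.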



adjoining in the same way that think can.
⁶It is also possible for both to substitute.
Figure 3: Tree-Local Multi-Component TAG

4 TAG and Clitic Climbing: The Problem and Previous 
Approaches 

Consider again the case of a clitic that does not ‘climb’, as in (7a), repeated
here as (12a). Since the elementary tree for comer contains all the arguments 
of comer, it would naturally contain the clitic las as well. While there would 
be some issues over exactly the right way to represent the clitic in the phrase 
structure, that would be the case for any formalism, and there is no particular 
problem caused for TAG. Whatever the desired representation of the clitic is, 
it can be used in the elementary tree for the comer clause with the clitic. 

(12) a. Luis quiere comerlas

b. Luis las quiere comer 

Luis wants to eat them 

However, in a clitic climbing case such as (7b), repeated here as (12b), 
the clitic appears in the higher clause. Since, by the CETM, the clitic must be 
part of the comer elementary tree, it must therefore appear in the higher clause 
as a result of adjoining. As discussed in the previous section, the adjoining 
operation for TAG is able to ‘stretch’ away components of an elementary tree. 
For the case of wh-movement, e.g., who does John think that Bill saw, does
John think adjoins in, pushing who away from that Bill saw. For subject-to- 
subject raising, as in John seems to like pizza, seems adjoins in to push John 
away. In both cases, the component that gets pushed away from the rest of its 
tree is on the periphery of the final sentence (who or John). 

However, in (12b), the element being ‘stretched away’, the clitic las, is not 
on the periphery of the clause. The clitic appears somewhere ‘in the middle’ 
of the higher clause. This is therefore a problem for TAG. 
(13) [Tree diagrams (13a), rooted in AgrSP, and (13b)]

4.1 TAG and Clitic Climbing: Previous Approach 

An analysis of clitic climbing in TAG was proposed by Bleam (1994).⁷ Bleam
(1994) crucially adopts the idea that the trigger verbs are those which can 
optionally take a ‘defective’ complement, namely VP instead of a full IP (or 
AgrSP, in the split-Infl structure assumed). The clitic is taken to attach to the 
T node, and so when the defective complement VP is selected, the clitic has 
no place to attach in the complement clause, and so must climb up. When the 
trigger verb selects a ‘full’ complement that includes a TP projection, the clitic 
attaches to the T node and so does not climb.⁸

For example, (12a), without clitic climbing, is derived by (13b) substituting
into the TP node of (13a), resulting in (14).⁹ Since (13b) projects up to TP,
there is ‘room’ for the clitic, which remains attached in the lower clause. 

For the clitic climbing case (12b), quiere takes a VP complement, as 
shown in (15a). Since the complement is only a VP, the clitic, which must 
attach to a T node, has nowhere to attach, and remains ‘hanging’. The com- 
plement clause is therefore represented by a multi-component tree set, as in 

⁷The other aspects of restructuring are not discussed. A quite different approach to
clitic climbing in TAG has recently been proposed by Candito (1999). Space prevents 
discussion here, but it does not alter the main points of this paper. 

⁸Support for this approach is given by the blocking of clitic climbing by negation,
on the assumption that negation is located higher than the attachment site of the clitic. 
If the lower clause has negation, then it must therefore also have ‘room’ for the clitic to 
attach. Likewise, if the clitic climbs, then the complement clause is defective and does 
not have room for negation. See Moore (1991), Rosen (1990), Wurmbrand (1998) for 
further arguments for this view. Napoli (1981) discusses some complications for this 
view in Italian. 

⁹For space reasons, there are some aspects of Bleam (1994)’s analysis that we cannot
discuss here, such as the need for set-local MCTAG. Although important, they are
not immediately relevant to the purpose of this paper. 
(15b), in which one component is the clitic waiting to be attached, and the
other component is the VP projection. The derivation proceeds by substituting
the VP component of (15b) into the VP node of (15a), while the clitic component
of (15b) adjoins into (15a) at the T node, resulting in (16), with the
clitic having ‘climbed’. Bleam (1994) assumes that the nonfinite verb moves
(adjoins) to T when there is a TP projection, which is the case when there is
no clitic climbing, as in (13b). In contrast, when the clause projects only to
VP, the verb must stay at V, since there is no T head to adjoin to, as in (15b).¹⁰

4.2 Some Problems 

An important technical aspect of Bleam (1994)’s analysis is that the clausal 
complementation is done by substitution, not adjoining. Substitution is used 
because the definition of multi-component TAG requires it. The derivation of 
(12b) just described uses a multi-component set (15b) in which one component 
(the VP component) substitutes in, while the other (the clitic) adjoins in. This 
is the scheme shown in (B) in Figure 3. If instead clausal complementation 
was done by adjoining, with the higher clause adjoining at the VP root of the 
lower comer tree, with the las tree adjoining into that higher clause, that would 
be the illegal scheme shown in (C) in Figure 3. 

While clausal complementation can be done in TAG either by adjoining 
or substitution, adjoining must be used when part of the lower clause ends 
up in the higher clause, either through wh-movement or raising. This is because
adjoining, but not substitution, allows the necessary ‘stretching apart’
of components of a tree. Bleam (1994)’s analysis, with the standard definition 
of TL-MCTAG, therefore makes the prediction that clitic climbing is impossi- 
ble when the higher clause must adjoin, not substitute, and so clitic climbing 
should not occur when the higher verb is a raising verb. 

(17) a. Luis suele comerlas

b. Luis las suele comer 

Luis them tends to eat 
‘Luis tends to eat them’ 

To see why this is the case, consider sentence (17b), which shows a clitic 
climbing to a higher trigger verb which is also a raising verb (Aissen and 

¹⁰An issue raised by this analysis is the status of PRO in the complement, which is
obscured by the fact that [Spec, VP] is not shown, although presumably PRO should
be there. This problem is not unique to Bleam (1994)’s TAG analysis, since every
analysis taking a VP complement must say something concerning this. We return to 
this issue later. 
Figure 4: Clitic-Climbing with a Raising Verb




Perlmutter (1983)). Since suele allows clitic-climbing, it is presumably taking 
a VP complement in (17b). But since it is also a raising verb, the subject 
of the sentence, Luis, is part of the comer tree (or tree set). Without worrying 
here about the details of where exactly suele might be adjoining, the derivation 
would be as roughly illustrated in Figure 4. As the figure shows, comer and 
las are part of a multicomponent tree set, and las adjoins into the raising verb, 
suele, which is itself adjoining into the other component of the tree set, Luis 
comer. 

However, this is exactly the derivation structure which is ruled out by 
tree-local multi-component TAG, as in Figure 3C. Therefore, Bleam (1994)’s 
analysis, using tree-local multi-component TAG, predicts that such a case will 
not occur, although in fact cases such as (17b) are indeed acceptable (the same 
is true for Italian). 

One possibility which seems reasonable is that the derivation could be 
handled by using a tree set for the comer clause as in (18b). The derivation 
would proceed by the comer component of (18b) substituting into the VP node
(19) [Tree diagram: the derived structure, rooted in AgrSP with Luis as its specifier]

of (18a), while the las component of (18b) adjoins at the T node of (18a), with 
the AgrS’ component of (18b) fitting ‘on top’ of (18a), resulting in (19). This 
last step is however again a technical difficulty for TAG. 

However, the intuition behind this approach is essentially correct, we
think, and the rest of the paper can be viewed as a working out of this intuition,
linking it to other problems that have been identified for basic TAG.

An examination of this example also points out an interesting issue con- 
cerning the idea of trigger verbs taking ‘defective complements’. Consider 
again example (17b). By Bleam (1994)’s analysis, the clitic climbs when the 
lower clause is defective, missing a tense projection. However, while it is 
missing the tense projection in (17b), it must at the same time also have a 
[Spec, AgrSP] projection. If the VP projection substitutes into the higher tree, 
then it must be the root of the tree that gets substituted in, and so the AgrSP 
projection with Luis would have to be a separate tree in a tree set, perhaps 
an undesirable move. For example, this is the case in (18b), but the need to 
represent the AgrSP projection as a separate tree is clearly an artifact of the 
handling of clitic-climbing, and it would be more desirable to represent the 
AgrSP projection in the same way as in other clauses. 

The same issues arise for the case of clitic climbing with wh-movement, 
the other case which requires clausal complementation by adjoining. To test 
this case, we need a case of a clitic climbing to a higher verb, while another 
argument is extracted. This cannot be tested with a lower verb that takes only 
one argument, since if that argument is cliticized, then it cannot also be ex- 
tracted as a wh-phrase. However, it can be tested with a lower verb that takes 
two NP arguments, as in the following examples:



(20) a. Juan quiere mostrartelos
Juan wants to show them to you
b. Juan te los quiere mostrar

(21) a. Que quiere mostrarte Juan
What want to-show-to-you Juan
‘What did Juan want to show to you?’
b. Que te quiere mostrar Juan

(22) a. A quien quiere mostrarlos Juan
To-whom want to-show-them Juan
‘To whom did Juan want to show them?’
b. A quien los quiere mostrar Juan



(20a) has a lower verb with two argument clitics, and both can climb to 
the higher verb, as shown in (20b). The object argument can be wh-moved, 
as shown in (21a), and, crucially, even with this extraction the dative clitic 
can climb to the higher verb (21b). This last sentence is therefore a problem 
for Bleam (1994)’s analysis. Similarly, the accusative clitic can climb to the 
higher verb, while the indirect object is wh-moved, as in (22ab).¹¹ Note that
in (21) and (22), in which the complement is supposedly ‘defective’, it seems 
to project up to a [Spec, CP] position. 

We are therefore left with two related problems: 

• There are cases in which the ‘defective complement’ has material (such as 
the subject or wh-item) which ends up above the root of the higher clause. 
This causes a problem for the definition of TL-MCTAG. 

• What does it mean for the complement to be ‘defective’, if in fact it does 
project higher up, to include a subject or wh-item? 

¹¹An analogous point for multi-component TAG and long distance scrambling in
German with extraction was made earlier by Rambow (1994). 

Also, the same is true for the analogous Italian examples: 

(i) a. Piero voleva spedirmelo

Piero wanted to send it to me

b. Piero me lo voleva spedire

c. Cosa voleva spedirmi
what he wanted to send to me
‘What did he want to send to me?’

d. Cosa mi voleva spedire?
(24) [Diagrams (a)-(c): the elementary trees in (11a-c) recast as collections of c-command relations, with arrows marking c-command]



5 TAG Derivation as C-command 

In this section we give a brief summary of the approach to TAG derivations 
taken in Frank and Vijay-Shanker (1998), Frank et al. (1999). This approach 
argues for a reconceptualization of the TAG formalism, in which the elemen- 
tary structures are collections of c-command relations, and the sole combi- 
natory operation is substitution, with adjoining eliminated. Here we give an 
illustration of how this approach solves one problem for TAG, and in the next 
section we discuss how this same approach solves some of the problems pre- 
sented by the data in the previous section. 

A TAG elementary tree is viewed as a collection of c-command relations 
determined by (at least) the following principles (cf. the definitions in Kayne 
(1994)): 

(23) a. A moved element c-commands its trace 

b. A head and its complement c-command one another 

c. A modifier c-commands the phrase it modifies 

d. A specifier c-commands the phrase to which it attaches 

For example, the raising case (11) is reinterpreted by viewing the elementary
trees (11a-c) as the collections of c-command relations (24a-c), where the
arrows indicate the c-command relations.

¹²The lines indicating direct domination are not intended as part of the representation,
but rather as an aid to the reader in comparing the proposed structure to that
The derivation is a monotonic combination of the c-command relations,
and proceeds by substituting the I’ node of (24b) into the bottom I’ node of 
(24c) (the substitution node). Maintaining the c-command relations results in 
(24a). One way to view this use of substitution is that to like pizza substitutes 
into the seems tree, with John ‘floating’ up to its final resting place. 

5.1 Solving a Long-standing Problem for TAG 

(25) Does John seem to like pizza? 

One long-standing problem for TAG has been the interaction of raising 
and subject-auxiliary inversion, as in (25). By the CETM, does should origi- 
nate in the same elementary tree as seem. However, since the raising auxiliary 
tree adjoins to the I’ node, there is no way to include the auxiliary verb does 
within the seems tree so that it ends up in a position preceding the subject DP 
in the final sentence. That is, adjoining at I’ ‘stretches’ John away from the 
to like pizza, but without allowing the ‘interleaving’ necessary to form (25).¹³
The c-command approach allows a resolution of this problem, by using the 

standardly assumed. Certain implicit c-command relations, such as that between I and 
subconstituents of VP, are suppressed in this figure, but are assumed to be present. See 
Frank and Vijay-Shanker (1998) for extensive discussion of the properties of structures 
defined in terms of c-command and the relationship between such structures and those 
defined in terms of dominance. 

¹³It may be possible to handle this do-support example by other means, such as
treating the auxiliary and raising verbs as separate trees, not members of a tree set.
However, the same problem extends to examples of raising/wh interaction such as (ib),
in which the experiencer of seems is extracted to the [Spec, CP] position. Here there is 
(27) [Diagram: the derived c-command structure for Does John seem to like pizza?]



collections of c-command relations in (26ab). The structure for like in (26a) 
is the same as (24b). For the does/seem structure in (26b), however, does 
is shown as having raised to the C node, which therefore c-commands the I
node.¹⁴

The derivation proceeds by substituting the I’ node of (26a) into the bot- 

no choice but to say that to whom and seem in (ib) are members of the same elementary 
structure, whether a tree set or a set of c-command relations. The solution argued for 
here for (25) also extends to (ib). 

(i) a. John seems to Bill to be crazy 

b. To whom_i does John seem t_i to be crazy



¹⁴Frank (1992) had previously proposed utilizing TL-MCTAG to solve this problem,
using the tree set in (i). The c-command approach allows a cleaner representation of
this solution.



(i) [Tree set diagram; component (a) is C_i dominating does]
tom I’ node of (26b). In the resulting structure, the does and John fragments
must both c-command I’, and so (27) is consistent with maintaining the c- 
command relations, and gives the desired derivation. 

However, the substitution of to like pizza into the lower I’ node of seems 
does not in fact fully determine the result shown in (27). The relative c- 
command relation of does and John is not determined — all that is known is 
that they both must c-command the I’ node headed by the trace of does. How- 
ever, if the IP that is the complement of the C node headed by does is the same 
as the IP parent of John, then the result must be as shown, with the inversion 
forced. The intuition is that there cannot be two IP nodes among the ‘floating’ 
components, where the John and does segments can be considered ‘floating’. 
Condition (28) was therefore proposed in Frank et al. (1999), with a precise 
characterization of ‘floating components’ left open. 

(28) Derivational CETM: The floating components of a derivation may 
constitute exactly one extended projection. 

5.2 Unboundedness 

Certain issues arise when multiple levels of embedding are considered, as in 
(29). The elementary structure headed by certain would be (30c), with the like 
and does/seem structures in (30ab) the same as before. 

(29) Does John seem to be certain to like pizza? 

(33) [Tree diagrams: (a) IP; (b) I’; (c) IP, with certain]

There are a number of ways this derivation could proceed. For example,
(30b) and (30c) could combine first, or (30a) and (30b) could combine
first. Only one of these derivations was allowed in Frank et al. (1999), by the
proposed condition on derivations (31):

(31) The structure containing the substitution node must be elementary 
(that is, not the product of a derivation). 

With this constraint, the derivation must proceed by substituting the I’
node of (30a) into the bottom I’ node of (30c), resulting in a structure for John
to be certain to like pizza. The final step in the derivation substitutes the upper
I’ node from this structure (to be certain to like pizza) into the bottom I’ node
of (30b), resulting in the desired derivation. This derivation can be viewed as
allowing John to ‘float up’ past the certain and seem clauses.
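
The effect of condition (31) on the order of combination can be illustrated with a small sketch. This is not part of the original analysis: the function name valid_derivations and the representation of a clause as a string label are assumptions made purely for illustration. Each step substitutes a lower structure into the structure directly above it, and (31) is modelled by requiring that the host structure be elementary.

```python
def valid_derivations(chain):
    """Enumerate derivations for a chain of clauses listed from highest to lowest.

    Each step substitutes the lower structure (guest) into the structure
    directly above it (host). Condition (31) is modelled, as an assumption,
    by requiring the host to be a single elementary structure.
    """
    results = []

    def step(segments, history):
        if len(segments) == 1:
            results.append(history)
            return
        for i in range(len(segments) - 1):
            host, guest = segments[i], segments[i + 1]
            if len(host) != 1:
                # The host is the product of a derivation: excluded by (31).
                continue
            merged = host + guest
            step(segments[:i] + [merged] + segments[i + 2:],
                 history + [(guest, host)])

    step([(clause,) for clause in chain], [])
    return results


# Hypothetical labels for the does/seem, certain and like structures of (30).
derivations = valid_derivations(["seem", "certain", "like"])
for guest, host in derivations[0]:
    print(" + ".join(guest), "substitutes into", " + ".join(host))
```

Under these assumptions only the bottom-up derivation survives: the like structure first substitutes into the certain structure, and the result then substitutes into the does/seem structure.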

5.3 Locality Constraints 

As just described, the recharacterization of TAG as c-command relations al- 
lows components of an elementary structure to be viewed as ‘floating’ up 
through the derivation. It is important, however, that such components
not be allowed to float ‘too far’. For example, consider a ‘superraising’ case, 
as in (32). 

(32) * John seems it is certain to like pizza 

The derivation could proceed by substituting the I’ node of (26a), repeated
here as (33a), into the bottom foot node of (33c), allowing John to ‘float up’. 
At this point in the derivation, there will be two floating items, both specifiers 
for IP (John and it), with no c-command relations between them. The derivational
CETM therefore applies, forcing these two IP nodes to be identified, since
otherwise they would constitute two distinct extended projections. The resulting
structure is therefore as shown in (34).

(34) [Tree diagram: IP structure over is certain to like pizza]

John is prevented from floating too far by the application of the deriva- 
tional CETM, which forces the IP node of John to be identified with the IP 
node of it, thus causing an illegal configuration.¹⁵

6 Fixing the Problems 

6.1 The Basic Case: Optional Clitic Climbing with One Trigger Verb

Adopting the TAG-as-c-command approach described in the previous section 
allows the derivation of the problematic cases, as we illustrate with the raising 
case (17b), repeated here as (35b). 

(35) a. Luis suele comerlas

b. Luis las suele comer 



A possible derivation is shown in (36). The structure for the comer clause 
is shown in (36a). The subject, Luis, has moved to [Spec, AgrsP], and thus 
must c-command [Spec, VP], as shown, although the TP projection is not 
projected in (36a). The clitic las is shown as having moved from the object 
position, and thus must c-command its trace. In addition, the representation 
shows that it must adjoin to a T projection, although there isn’t one in the 
comer clause (and so also the verb stays at the V node). The raising verb 

¹⁵There are different ways of ruling this an illegal configuration, although the most
obvious are a violation of the extended projection principle or of lack of Case for both
NPs.

Luis them tends to eat
‘Luis tends to eat them’

(36) [Tree diagrams: (a) AgrsP structure for the comer clause, with DP and Agrs’; (b) structure for suele taking a VP complement]



takes a VP complement, as in (36b). The derivation proceeds by the VP 
node of (36a) substituting into the bottom VP node of (36b). Maintaining 
the c-command relations gives the result in (37), with las attached to the T 
node for suele, as desired. To avoid clutter, we have not explicitly shown the 
c-command relations in (37). 

This derivation maintains the basic idea of Bleam (1994)’s approach, in 
which the trigger verb can optionally take a VP complement, getting the clitic 
when it does so. The use of substitution together with identification of the 
‘floating components’ solves the problem of how the comer clause projects 
‘up to’ AgrsP without including a TP projection, by using the c-command 
relations to allow Luis to raise to [Spec, AgrsP], without TP being specified at 
all, while still allowing substitution of the VP node into the suele clause. 

Before discussing further the issues raised by this approach, we illustrate 
the derivation with no clitic climbing. The tree for the lower clause in the case 
of no clitic climbing is shown in (38a). In this case, the lower clause projects 
the tense projection, the verb moves to the T node, and the clitic attaches. We 
assume that the trigger raising verb takes an AgrS’ foot node, as in (38b), when 
there is no clitic climbing. The derivation proceeds by the AgrS’ node of (38a) 
substituting into the bottom AgrS’ node of (38b), resulting in (39). 

¹⁶Since the suele clause does not select for a subject, neither [Spec, TP] nor [Spec,
VP] is projected. Thus, there is no difference between TP and T’, or VP and V’, and
so the intermediate projections have been eliminated.

(37) [Tree diagram: AgrsP structure]

(38) [Tree diagrams: (a) AgrsP; (b) Agrs’]

6.2 Handling the Unboundedness of Clitic Climbing

For a somewhat more complicated example, consider how the unboundedness 
of clitic climbing can be accounted for. Sentence (8), repeated here as (40), is 
a case of a clitic climbing over two trigger verbs, quiere and poder. Both of 
these trigger verbs are control verbs, and so the structure of the derivation is 
somewhat different from that with the raising trigger verb suele.

(40) Juan la quiere poder comprar 
Juan it wants to be able to buy 
‘Juan wants to be able to buy it’ 

Just as in Bleam (1994)’s analysis, quiere (41a) heads a clause that is 
taking a VP complement. The clauses for poder (41b) and comprar (41c) both 
project to VP, thus forcing the clitic to climb. 

Following the restriction (31) on derivations discussed earlier, the derivation
proceeds by substituting the VP node of (41c) into the bottom VP node of
(41b), resulting in (42).

¹⁷Actually, it’s not so clear that poder is a control verb, and there is also some ev-
idence that in Spanish quiere should be treated as a raising verb. We leave this issue
aside for now, and assume that these are control verbs.

Since there is no T projection in (42), the clitic is left ‘floating’. The top 
VP node of (42) is then substituted into the bottom VP node of (41a), resulting 
in (43), with the clitic then able to attach to the T node of the quiere clause. 



6.3 Floating Components and Identified Extended Projections 



Recall that in the c-command approach, a ‘Derivational CETM’ (DCETM) 
(28) was put forward as a way to control the movement of the ‘floating’ com- 
ponents, while leaving vague the definition of the floating components, al- 
though the intuitive sense was hopefully apparent. The DCETM was shown in 
Section 5 to have two effects in the examples discussed there: 



• In the derivation (26) of Does John seem to like pizza?, after to like pizza
substitutes into seems, the two floating components are John and does,
both of which refer to an IP projection. By the DCETM, the two IP pro-
jections must be identified, thereby fixing the order of does and John.

• In the potential derivation (33) of the unacceptable super-raising case (32), 
after is certain and to like pizza combine, John and it are the two floating 
components. Since they both refer to an IP projection (both being spec- 
ifiers of IP), and both are ‘floating’, by the DCETM they must both be 
specifiers of the same IP, resulting in the invalid (for independent reasons) 
structure (34). 
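
The two effects just listed can be mimicked in a minimal sketch. The representation is an assumption made only for illustration: each floating component is recorded as a triple of the item, its role, and the category of the projection it refers to, and the DCETM is approximated by collecting all floating components that refer to the same category onto a single node. Treating a doubled specifier as the source of the invalid configuration is likewise an assumption standing in for the independent reasons (EPP or Case) mentioned in the text.

```python
from collections import defaultdict


def identify_floating(floating):
    """Approximate the DCETM: floating components that refer to the same
    projection category are identified with a single node.

    floating: list of (item, role, projection) triples (hypothetical encoding).
    """
    merged = defaultdict(list)
    for item, role, projection in floating:
        merged[projection].append((item, role))
    return dict(merged)


def has_specifier_clash(merged):
    """True if some identified node ends up with more than one specifier."""
    return any(
        sum(1 for _, role in members if role == "specifier") > 1
        for members in merged.values()
    )


# Does John seem to like pizza?: does selects IP, John is the specifier of IP.
inversion = identify_floating([
    ("does", "selecting head", "IP"),
    ("John", "specifier", "IP"),
])

# *John seems it is certain to like pizza: John and it are both IP specifiers.
superraising = identify_floating([
    ("John", "specifier", "IP"),
    ("it", "specifier", "IP"),
])

print(has_specifier_clash(inversion))
print(has_specifier_clash(superraising))
```

In the inversion case a single IP is shared by does and John without conflict, while in the superraising case the single IP receives two specifiers, which is the configuration shown in (34).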

As the super-raising case in particular shows, these conditions on attach-
ment are very reminiscent of the ‘shortest move’ type of restrictions from work
in the Minimalist framework. The clitics in effect need to attach to a T node
as soon as there is a T node to attach to, just as the floating John had to attach
to an IP node as soon as one was available in the derivation of the superraising
case (32), thus preventing it from floating ‘too far’. It would be disappoint-
ing, however, if such a ‘shortest move’ restriction had to be imposed in a TAG
framework, since one of the claims of work in this framework (e.g., Frank and
Kroch (1995)) is that by utilizing clause-sized elementary structures instead of
the single-level items taken from the ‘numeration’, such stipulations as ‘short-
est move’ can be eliminated. It is therefore quite nice that the DCETM, by
‘identifying’ the nodes of the ‘floating’ components, accomplishes the same
effect (at least in the examples under discussion here).

[Figure 5 diagram not recovered; surviving label: bridge verbs]

Figure 5: Capturing Different Types of Inter-Clausal Movement in TAG

While there are different ways that one might characterize the ‘floating 
elements’, in the next section I will suggest that by reintroducing the crucial 
place of recursive structures in the framework, we can give a relatively simple 
characterization of ‘floating components’, one that is completely natural for 
the TAG approach. We also discuss how this accounts for further locality 
issues with clitic climbing. 

7 Recursive Structures and ‘Floating Elements’

Consider again how wh-movement and subject-to-subject raising are handled 
in TAG, as discussed in Section 3. The bridge verbs adjoin in as C’ recursive 
structures. Since the wh-moved items are at [Spec, CP], they are high enough
to be ‘stretched away’ by the bridge verb adjoining in. Similarly, subjects
in [Spec, IP] get ‘stretched away’ by the raising verb adjoining in as an I’
recursive structure. Since the raising verb adjoins low, the subject doesn’t
have to move as high as the wh-item.¹⁸ Although these standard examples of
A’ and A inter-clausal movement have quite different properties, in TAG they
are handled by the same mechanism, adjoining, with the differences in their
properties arising from the different loci of adjoining. The picture therefore
looks roughly like that in Figure 5, which illustrates the schema of the bridge
and raising verbs adjoining in.

¹⁸Indeed, aside from [Spec, VP] to [Spec, IP], if the VP-internal-subject-hypothesis
is adopted, it doesn’t have to move at all.

Figure 6: Inter-Clausal Movement in TAG, Revised

However, the discussion of the does/seem case (25) shows that even for 
the raising case, this picture is not fully accurate. The higher clause seems
does not consist only of a recursive I’ structure, but also of some structure
above the I’ node, namely does. This suggests a way to characterize what the
‘floating elements’ are in the c-command recharacterization of TAG. While 
space prevents going into the technical details, the basic idea is to bring back 
into this framework the fundamental place of recursive structures. 

A review of the examples discussed so far shows that in cases of ‘floating 
elements’, the higher clause (that the lower clause substitutes into) has a recur- 
sive component, and additional elements that c-command the top of that recur- 
sive component that are subject to the DCETM. The elements that c-command 
the substitution site in the lower clause are also subject to the DCETM. 

For example, the does...seem clause (26b) for Does John seem to like
pizza (25) has an I’-recursive component. The element c-commanding the
recursive part of the seems clause (that is, does) is considered ‘floating’ and
subject to the DCETM. The element c-commanding the substitution site I’ in
the like clause (26a) is also considered ‘floating’ and subject to the DCETM.
The DCETM causes the IP nodes for these two floating elements to be identi-
fied. The ‘floating’ components are therefore the part of the higher clause that
c-commands the higher recursive node, and the part of the lower clause that
c-commands the node that gets substituted into the higher clause. Similarly,
for the superraising case (32), the it is certain clause (33c) has an I’ recursive
part, with only it c-commanding the recursive part, and so it was considered
one of the ‘floating’ elements.

Figure 7: Trigger (Clitic-Climbing) Verbs
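
The characterization of the floating components in the does/seem case can be sketched as a simple lookup over c-command relations. The encoding is an assumption for illustration only: an elementary structure is reduced to a set of (commander, commanded) pairs, only the pairs relevant here are listed, and the node names I'_top and I'_bottom are hypothetical labels for the two ends of the recursive component.

```python
def floating_elements(c_command, node):
    """Return the elements that c-command the given node.

    c_command: set of (commander, commanded) pairs, a simplified and
    hypothetical encoding of an elementary structure.
    """
    return {commander for commander, commanded in c_command if commanded == node}


# Lower clause (26a): John c-commands the substitution site I'.
like_clause = {("John", "I'"), ("to", "VP")}

# Higher clause (26b): does c-commands the top of the I'-recursive component.
seem_clause = {("does", "I'_top"), ("seem", "I'_bottom")}

floating = (floating_elements(like_clause, "I'")
            | floating_elements(seem_clause, "I'_top"))
print(sorted(floating))
```

On this encoding the floating components are John, from the lower clause, and does, from the higher clause, which are the two elements that the DCETM then identifies with a single IP.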

The revised picture is shown in Figure 6. Note that the importance of the 
‘floating’ component of the higher clause is obscured by looking at the bridge 
verb case because there is no room for any structure above the C’ node to 
be floating, because all that’s left is the [Spec, CP] node. The importance of 
the does/seem case is that it shows how there must be material above the recursive
component that ‘merges’ in. 

There may in fact be some advantages in redefining the notion of deriva- 
tion to explicitly use the notion of adjoining plus ‘identifying’ the nodes in 
the floating components, although that cannot be discussed here. While the 
exact details of how those floating components are unified can be handled in 
a number of different ways, the important point is the overall structure of the 
derivation, and how the ‘Derivational CETM’ results in the desired constraints 
on movement.

7.1 Clitics and Locality

Now, consider again the place of clitic climbing with the ‘trigger verbs’ in this 
context. A look at the derivations in the previous section shows that they all 
have a VP recursive part, with some (but not all), having additional material 
above. So along with the ‘defective complements’ for raising and bridge verbs 
with the corresponding ‘floating material’, we have the picture in Figure 7. 

For example, in the derivation for Luis las suele comer (36), the suele
clause (36b) has a VP recursive component into which substitutes the VP node 
of the Luis las comer clause (36a). In order for the clitic to be ‘floating’, it just 
has to be specified that T c-commands not only the trace of the clitic, but also 
the VP node as well. Similarly, Luis in (36a) must be ‘floating’ as well, since it 
c-commands the substitution site, VP. In (36b), all the material c-commanding 
the VP recursive structure is ‘floating’; namely, Agrs and suele, as desired. 
The identification of nodes in the floating components gives the desired result. 
In the case without clitic climbing (38a), since the higher clause is simply an 
Agrs’ recursive structure, the locus of substitution, Agrs’, c-commands the 
clitic, and so the clitic is not floating.*^ 

(44) a. Juan cree que Luis quiere comprarla
Juan believes that Luis wants to buy it
‘Juan believes that Luis wants to buy it’

b. Juan cree que Luis la quiere comprar 

c. * Juan la cree que Luis quiere comprar 

We now consider some issues regarding how far the clitics can ‘float’, 
and how it is handled by the scheme just discussed. While the derivation in 
the previous section allows the clitic to ‘float’ up to derive the clitic-climbing 
case, we do not want to allow the clitic to float ‘too far’. For example, suppose
that las in (37) does not attach to suele, but rather remains ‘floating’. It could
then continue to float up to a higher clause, perhaps one headed by a non-trigger
verb, which would not be acceptable. A case of this type is (44c), in which the
clitic la has moved past the quiere clause. 

Consider first the derivation of the acceptable (44b). Since the clitic 
climbs to the quiere clause, this means that the quiere clause takes a VP com- 
plement. The quiere clause, (46a), is therefore the same as (41a), except that 
it also includes a CP node with que in the complementizer position. The cree 
clause (45) of course takes a CP complement. 

¹⁹Also, if in (38a) the clitic and verb are in a mutual c-command relation, then the
clitic could not move up without disrupting the c-command relations, thus violating the
required monotonicity of the derivation.

The derivation proceeds by substituting the VP node of (46b) into the
lower VP node of (46a). The ‘floating components’ are therefore the clitic in
(46b) and the structure c-commanding the higher VP in (46a). The T nodes
of quiere and la are therefore identified, resulting in structure (47). Then (47)
substitutes into the CP node of (45). Since there is no structure c-commanding
the CP node of (47), there is no material to ‘float’, and the clitic cannot climb
any further, and so (44c) cannot be derived.

7.2 Intersecting Clitic Climbing 

One interesting case of restrictions on clitic movement is what Aissen and 
Perlmutter (1983) referred to as ‘intersecting clitic climbing’. In all of the 
examples discussed so far, the trigger verbs are either raising or subject-control 
verbs. In Spanish, however, there are also some object-control verbs which 
allow clitic climbing, such as permitir, as in (48ab).²⁰

(48) a. Juan le permitió arreglarla (a Pedro)
Juan allowed Pedro to repair it
b. Juan se la permitió arreglar (a Pedro)

²⁰The le clitic in (48a) is changed to se when it appears with la, by the ‘spurious-se’
rule. This does not matter for our purposes here.

(49) a. Mari quiere [ permitir te [ ver lo ]]

Mari wants [ to permit you [ to see it ]]
‘Mari wants to permit you to see it’
NOM1 V1 [ V2 DAT2 [ V3 ACC3 ]]

b. Mari quiere [ permitir te lo_j [ ver t_j ]]
NOM1 V1 [ V2 DAT2 ACC3_j [ V3 t_j ]]

c. Mari te_i lo_j quiere [ permitir t_i [ ver t_j ]]
NOM1 DAT2_i ACC3_j V1 [ V2 t_i [ V3 t_j ]]

d. * Mari te_i quiere [ permitir t_i lo_j [ ver t_j ]]
NOM1 DAT2_i V1 [ V2 t_i ACC3_j [ V3 t_j ]]

e. * Mari lo_j quiere [ permitir te [ ver t_j ]]
NOM1 ACC3_j V1 [ V2 DAT2 [ V3 t_j ]]



Things get interesting when there are two trigger verbs, one a control verb, 
such as quiere, and one a verb such as permitir, as in (49).²¹ It is possible for a
clitic from the permitir clause and one from the lowest clause to both climb all
the way to the highest clause, as in (49c). It is also possible for the clitic from
the lowest clause to climb to the middle clause, as in (49b). However, when
this is done the clitics appear to be ‘stuck together’.²² It is not possible for the
lowest clitic to climb over the middle one, as in (49e), nor for the clitic from 
the middle clause to climb to the higher clause while the one from the lowest 
clause climbs to the middle clause, as in (49d). 

²¹The bottom lines are meant to help illustrate the pattern: NOM1 refers to the nom-
inative argument of the first (highest) verb, etc. Also, the clitics have been written
separately from the infinitival verb to better show what has moved where.

²²Bleam (1994) referred to this as a ‘bandwagon’ effect.

(51) [Tree diagrams: (a) TP; remaining structure not recovered]




We now show how this falls out of the structure of a derivation as dis- 
cussed so far. The case with no clitic climbing, (49a), is straightforward. Both 
quiere and permitir take complements that are ‘bigger’ than just VP projec- 
tions. Most likely these are CP projections, but to avoid clutter we will just 
show them as TP projections, although it doesn’t matter for present purposes. 
The point is that the complement is ‘non-defective’ enough that the clitics
do not climb. To derive (49a), therefore, the structures in (50) and (51) are 
used, with (51b) substituting into (51a), with the result substituting into (50). 

To derive (49b), the ver clause must be a VP, so that the clitic is forced 
to climb. At the same time, the clitic te from the permitir clause does not 
climb, and so the permitir clause is a TP clause. Therefore the structures in 
(52) and (53) are used. The VP node of (53b) substitutes into the bottom VP 
node of (53a). At that point the lo clitic and the structure above the higher VP 
node in (53a) are ‘unified’ — that is, the T nodes are identified, and so the clitic 
structure lo must attach to the te permitir structure. The clitics are both now 
‘stuck’ on permitir. 

Similarly, to derive (49c), the structures in (54) and (55) are used, in which 
both permitir and ver project only to VP. As is hopefully clear, this forces both 
clitics to climb to the quiere clause. 

Now consider the unacceptable cases (49de). For (49d), since te climbs 
from the permitir to the quiere clause, then the permitir clause must project 
only to VP, with quiere taking a VP complement. Since lo climbs from the 
ver clause to the permitir clause, it must project only to VP, and the permitir 
clause takes a VP complement. But then these are the same structures as used
to derive (49c), and so (49d) cannot be derived.

For (49e), since te does not climb out of the permitir clause, the permitir
clause must project higher than VP, TP in this example. Since lo climbs out of
the ver clause, the latter must only project to VP. Therefore the permitir clause
is (53a), and the ver clause is (53b). These are the same structures used to
derive (49b), and so (49e) cannot be derived. Once the VP node of (53b)
substitutes into the bottom VP node of (53a), the T nodes are ‘identified’, and
lo is attached to permitir te.²³

In short, the use of the Derivational CETM, forcing the ‘floating com- 
ponents’ to be identified, allows the desired locality constraints to be pre- 
served. Again, it is accomplishing the same effect as a ‘shortest move’ type 
constraint.²⁴
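
The reasoning behind the pattern in (49) can be checked mechanically with a small sketch. It rests on simplifying assumptions that are not part of the analysis itself: a clause either has a T projection or projects only to VP, the highest clause always has T, and a clitic attaches to the T node of the nearest clause at or above its own that has one. All names in the code are hypothetical.

```python
from itertools import product

VERBS = ["quiere", "permitir", "ver"]      # highest to lowest clause
ORIGIN = {"te": "permitir", "lo": "ver"}   # clause in which each clitic starts


def clitic_hosts(has_t):
    """Attach each clitic to the nearest clause, at or above its own,
    that has a T projection (a simplified stand-in for the DCETM)."""
    hosts = {}
    for clitic, verb in ORIGIN.items():
        k = VERBS.index(verb)
        while not has_t[VERBS[k]]:
            k -= 1  # clause projects only to VP: the clitic keeps floating
        hosts[clitic] = VERBS[k]
    return hosts


# Try every choice of complement size for the permitir and ver clauses.
derivable = set()
for permitir_t, ver_t in product([True, False], repeat=2):
    has_t = {"quiere": True, "permitir": permitir_t, "ver": ver_t}
    hosts = clitic_hosts(has_t)
    derivable.add((hosts["te"], hosts["lo"]))

# Surface hosts of (te, lo) in each example of (49).
PATTERNS = {
    "(49a)": ("permitir", "ver"),
    "(49b)": ("permitir", "permitir"),
    "(49c)": ("quiere", "quiere"),
    "(49d)": ("quiere", "permitir"),
    "(49e)": ("permitir", "quiere"),
}
for label, pattern in PATTERNS.items():
    print(label, "derivable" if pattern in derivable else "not derivable")
```

Under these assumptions exactly four host assignments are derivable: those of (49a), (49b) and (49c), together with the additional acceptable case in which te climbs to quiere while lo stays with ver. The assignments needed for (49d) and (49e) never arise, because no choice of complement sizes produces them.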

²³For space reasons, we have left out one additional case, (i), in which the middle
clitic climbs to the highest clause, while the lowest clitic remains with the lowest clause.

(i) Mari te_i quiere [ permitir t_i [ ver lo ]]
NOM1 DAT2_i V1 [ V2 t_i [ V3 ACC3 ]]

This is acceptable, and can be derived without a problem. The quiere clause takes a
VP complement, forcing te to climb up, and so the structure (54) is used for the quiere
clause. The permitir clause takes a structure like (55a), but with the difference that its
complement is TP, not VP. Therefore the clause for ver is (51b), and lo stays in the ver
clause.

²⁴Space prevents further discussion here, but Bleam (1994) used the locality prop-
erties of the TAG variant called ‘set-local TAG’ to derive the unacceptability of the
violations of (d) and (e). The fact that the system described here accomplishes the
same result suggests that the crucial issue is not the particular features of set-local
TAG, but rather the typology of the trees, as long as the derivational machinery can
take advantage of their properties.

(57) [Tree diagrams (a) and (b) not recovered; (a) is rooted in AgrsP]

(56) Luis la quiere comprar 
Luis it wants to buy 
‘Luis wants to buy it’ 

We end with a brief comment on the problem of licensing PRO in the VP 
complement. The use of ‘identifying’ the floating elements allows for what 
may be an interesting approach to this problem. Suppose that in a sentence 
such as (56), quiere takes a V’, not a VP, complement, shown in (57a), al- 
though the complement clause (57b) is still a VP clause. The difference now 
is that the V’, not the VP, node of (57b) substitutes into (57a) at its bottom 
V’ node. Now all the material that c-commands V’ in (57b), namely the PRO 
specifier of VP and the clitic, and the material that c-commands the higher V’ 
node in (57a) are subject to the DCETM. This has the effect of making
PRO from the comprar clause and the t_k trace of the subject from the quiere
clause both be specifiers of the same VP, resulting in (58).

This is the same situation as in the superraising case (34), except that there 
it was multiple IP specifiers, rather than multiple VP specifiers here, where one 
specifier is a PRO. It is easy to imagine a story whereby PRO can be licensed 
in this configuration, by getting coindexed with the other specifier of VP. We 
leave for future work the exact working out of this account. It is encouraging 
to note, though, that this possibility follows from the derivational machinery 
used so far.²⁵

²⁵There is some similarity between this ‘multiple [Spec, VP]’ approach to defective
complements and the ‘movement-to-[Spec, VP]’ approach of Boskovic (1994). We
leave for future work a comparison of these approaches.



8 Conclusion 



In this paper we have discussed an analysis of clitic climbing within the frame-
work of Tree Adjoining Grammar. We showed that the analysis proposed by
Bleam (1994) is inadequate for certain cases, in particular those in which the 
trigger verb is a raising or bridge verb. We discussed how by adopting the 
reconceptualization of TAG as monotonic c-command as proposed elsewhere, 
these problems can be overcome. This leads naturally to a conception of inter- 
clausal movement in TAG in which an internal node of the lower clause sub- 
stitutes into a node of the higher clause with the higher parts of each clause 
‘merging’ together, in the sense discussed. The derivational structure is the 
same for all types of inter-clausal movement — as discussed in this paper, for 
wh-movement, raising, and clitic-climbing. The differences in their properties 
arise from the differing loci of substitution, and the consequences of that loca- 
tion for movement in the structure for the lower clause (how far an NP has to 
move to be above the locus of substitution). 

There are a number of issues related to this work that require further in- 
vestigation. The most immediate is a precise characterization of the ‘floating’ 
elements and how they are ‘identified’ by the DCETM. The view suggested 
here based on the recursive structure of the higher clause may be useful, or
it may be possible to characterize the floating components solely in terms of
how ‘loose’ they are in the c-command relations.

There are several details regarding the analysis of clitic-climbing that need 
to be cleaned up. One is the issue of the clitic-verb order. A second issue, 
perhaps more serious and certainly more interesting, concerns how the clausal 
structures differ in the clitic and non-clitic-climbing cases. Ideally, we would 
like for there to be just a ‘one bit’ difference between the two cases — one 
parameter is changed, and clitic climbing either occurs or not. In particular, 
we would like to say that if the higher clause takes a different size complement 
(VP or TP), then clitic-climbing either does or does not occur. However, in the 
analysis described here, the complement size taken by the higher clause must 
correlate with the structure of the lower clause. That is because, given the 
assumption that the verb moves to T, the infinitival lower verb moves to T 
when there is no clitic climbing, and does not move to T when there is. If the 
higher clause selects a VP complement, then the lower clause must not have 
a TP projection, and the clitic must be ‘floating’ by itself. If the lower clause 
did have a TP projection, then the verb would move to that T projection, and 
then both the clitic and the lower verb would end up above the higher verb, 
obviously undesirable. The most obvious way to fix this problem is to modify 
the placement of the clitic to be above the place where the infinitival verb 
moves. Then the desired result could obtain in which the lower infinitival 
clause is always the same, with the appearance of the clitic in the lower or 
higher clause dependent only on the size of the complement selected by the 
higher verb. 

One further area of work is the investigation of how other problematic 
cases of long-distance movement, such as long distance scrambling in Ger- 
man (Rambow (1994)), should be integrated into this approach. Of particular 
interest is whether such scrambling follows the pattern of ‘intersecting clitic 
climbing’ as in (49) and if not, how the different patterns of movement can 
be integrated into this approach without altering the basic derivational mecha- 
nism. 

Also, the other aspects of ‘restructuring’ in Romance, such as the ‘long 
middle-^/’, should be integrated into this approach. For reasons that can’t 
be discussed here, these other aspects raise different challenges for TAG (see 
Kulick (1998) for discussion). There is also an interaction between these other 
aspects and clitic climbing that is important to capture.

The Convergence of Lexicalist Perspectives in
Psycholinguistics and Computational Linguistics* 

Albert E. Kim, Bangalore Srinivas and John C. Trueswell 

1 Introduction 

In the last fifteen years, there has been a striking convergence of perspectives 
in the fields of linguistics, computational linguistics, and psycholinguistics 
regarding the representation and processing of grammatical information. 
First, the lexicon has played an increasingly important role in the representa- 
tion of the syntactic aspects of language. This is exemplified by the rise of 
grammatical formalisms that assign a central role to the lexicon for charac- 
terizing syntactic forms, e.g., LFG (Bresnan and Kaplan 1982), HPSG (Pol- 
lard and Sag 1994), CCG (Steedman 1996), Lexicon-Grammars (Gross 
1984), LTAG (Joshi and Schabes 1996), Link Grammars (Sleator and Tem- 
perley 1991) and the Minimalist Program (Chomsky 1995). Second, theories 
of language processing have seen a shift away from ‘rule-governed’ ap- 
proaches for grammatical decision-making toward statistical and constraint- 
based approaches. In psycholinguistics, this has been characterized by a 
strong interest in connectionist and activation-based models (e.g., Lewis 
1993, McRae, Spivey-Knowlton and Tanenhaus 1998, Stevenson 1994, Ta- 
bor, Juliano and Tanenhaus 1996). In computational linguistics, this is found 
in the explosion of work with stochastic approaches to structural processing 
(cf. Church and Mercer 1993, Marcus 1995). In linguistics, this interest is 
most apparent in the development of Optimality Theory (Prince and Smolen- 
sky 1997). 

In this paper, we highlight how the shift to lexical and statistical ap- 
proaches has affected theories of sentence parsing in both psycholinguistics 
and computational linguistics. In particular, we present an integration of 
ideas developed across these two disciplines, which builds upon a specific 
proposal from each. Within psycholinguistics, we discuss the development of 
the Constraint-Based Lexicalist (CBL) theory of sentence processing (Mac- 
Donald, Pearlmutter and Seidenberg 1994, Trueswell and Tanenhaus 1994). 



*This work was partially supported by National Science Foundation Grant SBR- 
96-16833; the University of Pennsylvania Research Foundation; and the Institute for 
Research in Cognitive Science at the University of Pennsylvania (NSF-STC Coop- 
erative Agreement number SBR-89-20230). The authors thank Marian Logrip for 
assistance in the preparation of this paper and thank Paola Merlo, Suzanne Stevenson, 
and two anonymous reviewers for helpful comments on the paper. 

Within computational linguistics, we discuss the development of statistical 
approaches to processing Lexicalized Tree-Adjoining Grammar (LTAG, 
Joshi and Schabes 1996). Finally, we provide a description of the CBL the- 
ory, which is based on LTAG. 

2 A Constraint-Based Theory of Sentence Processing 



Psycholinguistic thinking about the syntactic aspects of language compre- 
hension has been deeply influenced by theories that assign a privileged role 
to supra-lexical syntactic representations and processes. This view has been 
most extensively developed in the theory of Frazier (1979, 1989), which 
proposed that syntactic processing is controlled by a two-staged system. In 
the first stage, a single syntactic representation of the input is computed us- 
ing a limited set of phrase structure rules and basic grammatical category 
information about words. When syntactic knowledge ambiguously allows 
multiple analyses of the input, a single analysis is selected using a small set 
of structure-based processing strategies. In a second stage of processing, the 
output of this structure-building stage is integrated with and checked against 
lexically specific knowledge and contextual information, and initial analyses 
are revised if necessary. The basic proposal of this theory — that syntactic 
processing is, at least in the earliest stages, independent from lexically spe- 
cific and contextual influences — has been one of the dominant ideas of sen- 
tence processing theory (e.g., Ferreira and Clifton 1986, Perfetti 1990, 
Mitchell 1987, 1989, Rayner, Carlson and Frazier 1983). 

A diverse group of recent theories has challenged this two-stage struc- 
ture-building paradigm by implicating some combination of lexical and 
contextual constraints and probabilistic processing mechanisms in the earli- 
est stages of syntactic processing (Crocker 1994, Corley and Crocker 1996, 
Ford, Bresnan and Kaplan 1982, Gibson 1998, Jurafsky 1996, MacDonald et 
al. 1994, Pritchett 1992, Stevenson 1994, Trueswell and Tanenhaus 1994). 
We focus in this paper on the body of work known as the Constraint-Based 
Lexicalist theory (MacDonald et al. 1994, Trueswell and Tanenhaus 1994), 
which proposes that all aspects of language comprehension, including the 
syntactic aspects, are better described as the result of pattern recognition 
processes than the application of structure building rules. Word recognition 
is proposed to include the activation of rich grammatical structures (e.g., 
verb argument structures), which play a critical role in supporting the se- 
mantic interpretation of the sentence. These structures are activated in a pat- 
tern shaped by frequency, with grammatically ambiguous words causing the 
temporary activation of multiple structures. The selection of the appropriate 
structure for each word, given the context, accomplishes much of the work 
of syntactic analysis. That is, much of the syntactic ambiguity in language is 
proposed to stem directly from lexical ambiguity and to be resolved during 
word recognition.¹ The theory predicts that initial parsing preferences are 
guided by these grammatical aspects of word recognition. 

The CBL framework can be illustrated by considering the role of verb 
argument structure in the processing of syntactic ambiguities like the Noun 
Phrase / Sentence Complement (NP/S) ambiguity in sentences like (1a) and 
(1b). 

(1) a. The chef forgot the recipe was in the back of the book. 

b. The chef claimed the recipe was in the back of the book. 

In (1a), a temporary ambiguity arises in the relationship between the noun 
phrase the recipe and the verb forgot. Due to the argument structure 
possibilities of forgot, the noun phrase could be the direct object or the subject of 
a sentence complement. In sentences like this, readers show an initial prefer- 
ence for the direct object interpretation of the ambiguous noun phrase, re- 
sulting in increased reading times at the disambiguating region was in... 
(e.g., Holmes, Stowe and Cupples 1989, Ferreira and Henderson 1990, 
Rayner and Frazier 1987). On the CBL theory, the direct object preference in 
(1a) is due to the lexical representation of the verb forgot, which has a strong 
tendency to take a direct object rather than a sentence complement. The CBL 
theory proposes that word recognition includes the activation of not only 
semantic and phonological representations of a word, but also detailed syn- 
tactic representations. These lexico-syntactic representations, and the proc- 
esses by which they are activated, are proposed to play critical roles in the 
combinatory commitments of language comprehension. The preference for 
the direct object in (1a) should therefore be eliminated when the verb forgot 
is replaced with a verb like claimed, which has a strong tendency to take a 
sentence complement rather than a direct object. These predictions have been 
confirmed experimentally (Trueswell, Tanenhaus and Kello 1993, Garnsey, 
Pearlmutter, Myers and Lotocky 1997), and connectionist models have been 
constructed which capture these preferences (Juliano and Tanenhaus 1994, 
Tabor et al. 1996). 

¹The amount of syntactic structure that is lexically generated goes beyond the 
classical notion of argument structure. In lexicalized grammar formalisms such as 
LTAG, the entire grammar is in the lexicon. For instance, the attachment site of a 
preposition can be treated as a lexically specific feature. Noun-attaching prepositions 
and verb-attaching prepositions have different senses. We will discuss this in further 
detail in the following sections. 

Experimental work has also indicated that the pattern of processing 
commitments is not determined solely by individual lexical preferences, but 
involves an interaction between argument structure preference and lexical 
frequency. NP-biased verbs result in strong direct object commitments re- 
gardless of the lexical frequency of the verb. S-bias verbs, on the other hand, 
show an effect of frequency, with high frequency items resulting in strong S- 
complement commitments and low frequency items resulting in much 
weaker S-complement commitments (Juliano and Tanenhaus 1993, though 
see Garnsey et al. 1997). This interaction between frequency and structural 
preference is explained by Juliano and Tanenhaus (1993) as occurring be- 
cause the argument structure preferences of S-bias verbs must compete for 
activation with the regular pattern of the language — that an NP after a verb is 
a direct object. The ability of the S-bias verbs to overcome this competing 
cue depends upon frequency. Juliano and Tanenhaus (1994) present a 
connectionist model that shows that such interactions emerge naturally from 
constraint-based lexicalist models, since the models learn to represent more 
accurately the preferences of high frequency items. In later sections, we re- 
turn to the issue of interactions between lexical frequency and ‘regularity’ 
and discuss its implications for the architecture of computational models of 
language processing. 

The CBL theory has provided an account for experimental results in- 
volving a wide range of syntactic ambiguities (e.g., Boland, Tanenhaus, 
Garnsey and Carlson 1995, Garnsey et al. 1997, Juliano and Tanenhaus 
1993, Trueswell and Kim 1998, MacDonald 1993, 1994, Spivey-Knowlton 
and Sedivy 1995, Trueswell et al. 1993, Trueswell, Tanenhaus and Garnsey 
1994, cf. MacDonald et al. 1994). As this body of experimental results has 
grown, there has been a need to expand the grammatical coverage of com- 
putational modeling work to match that of the most comprehensive descrip- 
tions of the CBL theory, which have been wide in scope, but have not been 
computationally explicit (MacDonald et al. 1994, Trueswell and Tanenhaus 
1994). Existing computational models have focused on providing detailed 
constraint-based accounts of the pattern of processing preferences for par- 
ticular sets of experimental results (McRae et al. 1998, Tabor et al. 1996, 
Spivey-Knowlton 1996, Juliano and Tanenhaus 1994). These models have 
tended to be limited syntactic processors, with each model addressing the 
data surrounding a small range of syntactic ambiguities (e.g., the NP/S am- 
biguity). This targeted approach has left open some questions about how 
CBL-based models ‘scale up’ to more complicated grammatical tasks and 
more comprehensive samples of the language. For instance, the Juliano and 
Tanenhaus model learns to assign seven different verb complement types 
based on co-occurrence information about a set of fewer than 200 words. The 
full language involves a much greater number of syntactic possibilities and 
more complicated co-occurrence relationships. It is possible that the 
complexities of computing the fine-grained statistical relationships of the full 
language may be qualitatively greater than in these simple domains, or even 
intractable (Mitchell, Cuetos, Corley and Brysbaert 1995). It is also possible 
that these targeted models are so tightly focused on specific sets of experi- 
mental data that they have acquired parameter settings that are inconsistent 
with other data (see Frazier 1995). Thus, there is a need to examine whether 
the principles of the theory support a model that provides comprehensive 
syntactic coverage of the language but which still predicts fine-grained pat- 
terns of argument structure availability. 

3 Lexicalized Grammars and Supertagging 

In developing a broader and more formal account of psycholinguistic find- 
ings, we have capitalized on a convergence between the CBL movement in 
psycholinguistics and similar movements in theoretical and computational 
linguistics. Theoretical linguistics has increasingly treated the lexicon, rather 
than supra-lexical rules, as the repository of syntactic information, giving 
rise to “lexicalist” grammars (Bresnan and Kaplan 1982, Pollard and Sag 
1994, Joshi and Schabes 1996, Steedman 1996). In a parallel development, 
computational linguistics has produced an extensive body of work on statis- 
tical techniques for ambiguity resolution such as part-of-speech tagging and 
stochastic parsing methods. Within this work, methods that have focused on 
the statistics of lexical items have generally outperformed methods that focus 
on the statistics of supra-lexical structural events, such as statistical 
context-free grammars (Marcus 1995). The success of these approaches to 
processing has expanded the set of computational mechanisms made available to 
psycholinguistics as conceptual tools. Both of these developments have been 
similar in spirit to CBL thinking. We have attempted to advance the formal 
specification of constraint-based proposals in psycholinguistics by building 
upon the foundation of one lexicalist grammatical formalism, Lexicalized 
Tree-Adjoining Grammar (LTAG, Joshi and Schabes 1996). We have also 
drawn insights from work on statistical techniques for processing over LTAG 
(Srinivas and Joshi 1998). This section introduces LTAG and representa- 
tional and processing issues within it. 

The idea behind LTAG is to localize the computation of linguistic 
structure by associating lexical items with rich descriptions that impose 
complex combinatory constraints in a local context. Each lexical item is as- 
sociated with at least one “elementary tree” structure, which encodes the 
“minimal syntactic environment” of a lexical item. This includes such 
information as head-complement requirements, filler-gap information, tense, 
and voice. Figure 1 shows some of the elementary trees associated with the 
words of the sentence The police officer believed the victim was lying.² The 
trees involved in the correct parse of the sentence are highlighted by boxes. 
Note that the highlighted tree for believed specifies each of the word’s 
arguments, a sentential complement and a noun phrase subject. 

Figure 1: A partial illustration of the elementary tree possibilities for the 
sentence the police officer believed the victim was lying. Trees involved 
in the correct parse of the sentence are highlighted in boxes. 

Encoding combinatory information in the lexicon rather than in supra- 
lexical rules has interesting effects on the nature of structural analysis. One 
effect is that the number of different descriptions for each lexical item 
becomes much larger than when the descriptions are less complex. 

²The down-arrows and asterisks in the trees mark nodes at which trees make 
contact with each other during the two kinds of combinatory operations of Tree 
Adjoining Grammar, substitution and adjunction. Down-arrows mark nodes at which the 
substitution operation occurs, and asterisks mark foot nodes, which participate in the 
adjunction operation. The details of the combinatory operations of TAG are beyond 
the scope of this paper. See Joshi and Schabes (1996) for a discussion. 

For instance, the average elementary tree ambiguity for a word in Wall Street 
Journal text is about 47 trees (Srinivas and Joshi 1998). In contrast, part-of- 
speech tags, which provide a much less complex description of words, have 
an ambiguity of about 1.2 tags per word in Wall Street Journal text. Thus, 
lexicalization increases the local ambiguity for the parser, complicating the 
problem of lexical ambiguity resolution. The increased lexical ambiguity is 
partially illustrated in Figure 1, where six out of eight words have multiple 
elementary tree possibilities. The flip-side to this increased lexical ambigu- 
ity, however, is that resolution of lexical ambiguity yields a representation 
that is effectively a parse, drastically reducing the amount of work to be done 
after lexical ambiguity is resolved (Srinivas and Joshi 1998). This is because 
the elementary trees impose such complex combinatory constraints in their 
own local contexts that there are very few ways for the trees to combine once 
they have been correctly chosen. The elementary trees can be understood as 
having ‘compiled out’ what would be rule applications in a context-free 
grammar system, so that once they have been correctly assigned, most syn- 
tactic ambiguity has been resolved. Thus, the lexicalization of grammar 
causes much of the computational work of structural analysis to shift from 
grammatical rule application to lexical ambiguity resolution. We refer to the 
elementary trees of the grammar as supertags, treating them as complex 
analogs to part-of-speech tags. We refer to the process of resolving supertag 
ambiguity as supertagging. One indication that the work of structural analy- 
sis has indeed been shifted into lexical ambiguity resolution is that the run- 
time of the parser is reduced by a factor of thirty when the correct supertags 
for a sentence are selected in advance of parsing.³ 

Importantly for the current work, this change in the nature of parsing has 
been complemented by the recent development of statistical techniques for 
lexical ambiguity resolution. Simple statistical methods for resolving part-of- 
speech ambiguity have been one of the major successes in recent work on 
statistical natural language processing (cf. Church and Mercer 1993, Marcus 
1995). Several algorithms tag part-of-speech with accuracy between 95% 
and 97% (cf. Charniak 1993). Applying such techniques to the words in a 
sentence before parsing can substantially reduce the work of the parser by 
preventing the construction of spurious syntactic analyses. Recently, Srinivas 
and Joshi (1998) have demonstrated that the same techniques can be 
effective in resolving the greater ambiguity of supertags. They implemented a 
trigram Hidden Markov Model of supertag disambiguation. When trained on 
200,000 words of parsed Wall Street Journal text, this model produced the 
correct supertag for 90.9% of lexical items in a set of held-out testing data. 

³This is based on run-times for a sample of 1300 sentences of Wall Street 
Journal text, reported by Srinivas and Joshi (1998). Running the parser without 
supertagging took 120 seconds, while running it with correct supertags 
pre-assigned took 4 seconds. 

Thus, simple statistical techniques for lexical ambiguity resolution can 
be applied to supertags just as they can to part-of-speech ambiguity. Due to 
the highly constraining nature of supertags, these techniques have an even 
greater impact on structural analysis when applied to supertags than when 
applied to part-of-speech tagging. These results provide a demonstration that 
much of the computational work of linguistic analysis, which has tradition- 
ally been understood as the result of structure building operations, might 
instead be seen as lexical disambiguation. This has important implications 
for how psycholinguists are to conceptualize structural analysis. It expands 
the potential role in syntactic analysis of simple pattern recognition mecha- 
nisms for word recognition, which have played a very limited role in classi- 
cal models of human syntactic processing. 
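
The idea of treating supertag disambiguation as a sequence-tagging problem can be sketched in a few lines of Python. This is only an illustration and not the Srinivas and Joshi (1998) system: it uses a bigram rather than a trigram Hidden Markov Model, add-one smoothing, and a tiny hand-made training set. The supertag names (Det, N, V_np, V_s, V_aux, Adj) and the function names are hypothetical labels invented for the example.

```python
import math
from collections import Counter

START = "<s>"

# Toy (word, supertag) training sentences; the tag names are hypothetical.
TRAIN = [
    [("the", "Det"), ("chef", "N"), ("forgot", "V_np"),
     ("the", "Det"), ("recipe", "N")],
    [("the", "Det"), ("chef", "N"), ("claimed", "V_s"),
     ("the", "Det"), ("recipe", "N"), ("was", "V_aux"), ("old", "Adj")],
    [("the", "Det"), ("chef", "N"), ("forgot", "V_s"),
     ("the", "Det"), ("recipe", "N"), ("was", "V_aux"), ("old", "Adj")],
    [("the", "Det"), ("cook", "N"), ("forgot", "V_np"),
     ("the", "Det"), ("book", "N")],
]


def train(sentences):
    """Collect bigram transition counts and emission counts."""
    trans, emit = Counter(), Counter()
    prev_total, tag_total = Counter(), Counter()
    vocab = set()
    for sent in sentences:
        prev = START
        for word, tag in sent:
            trans[(prev, tag)] += 1
            prev_total[prev] += 1
            emit[(tag, word)] += 1
            tag_total[tag] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, prev_total, tag_total, sorted(tag_total), len(vocab)


def viterbi(words, model):
    """Return the most probable supertag sequence under the bigram HMM."""
    trans, emit, prev_total, tag_total, tags, vocab_size = model

    def log_t(prev, tag):  # add-one smoothed transition log-probability
        return math.log((trans[(prev, tag)] + 1) / (prev_total[prev] + len(tags)))

    def log_e(tag, word):  # add-one smoothed emission, one slot for unknowns
        return math.log((emit[(tag, word)] + 1) / (tag_total[tag] + vocab_size + 1))

    # Each column maps tag -> (best log-probability so far, backpointer).
    best = [{t: (log_t(START, t) + log_e(t, words[0]), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            score, back = max((best[-1][p][0] + log_t(p, t), p) for p in tags)
            column[t] = (score + log_e(t, word), back)
        best.append(column)

    # Follow backpointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))


model = train(TRAIN)
print(viterbi("the chef claimed the recipe was old".split(), model))
```

In this toy setting the tag chosen for a verb such as forgot or claimed is driven by how often the verb occurred with each supertag in training, which is the sense in which lexical disambiguation does the work of structural choice.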

Note that the claim here is not that supertagging accomplishes the entire 
task of structural analysis. After elementary trees have been selected for the 
words in a sentence, there remains the job of connecting the trees via the 
LTAG combinatory operations of adjunction and substitution. The principal 
claim of this section is that in designing a system for syntactic analysis there 
are sound linguistic and engineering reasons for storing large amounts of 
grammatical information in the lexicon and for performing much of the work 
of syntactic analysis with something like supertagging. If such a system is 
also to be used as a psycholinguistic model, it is natural to predict that many 
of the initial processing commitments of syntactic analysis are made by a 
level of processing analogous to supertagging. In the following section, we 
discuss how an LTAG-based supertagging system resolves at the lexical level 
many of the same syntactic ambiguities that have concerned researchers in 
human sentence processing, suggesting that a supertagging system might 
provide a good psycholinguistic model of syntactic processing. Thus, al- 
though the question of how such a system fits into a complete language 
processing system is an important one, it may be useful to begin exploring 
the psychological implications of supertagging in advance of a thorough 
understanding of how to design the rest of the system.⁴ 



⁴Srinivas (1997) has suggested that this can be done by a process that is simpler 
than full parsing. He calls this process “stapling”. 

4 A Model of the Grammatical Aspects of Word 
Recognition Using LTAG 

In the remaining sections of this paper, we describe an on-going project 
which attempts to use LTAG to develop a more fully-specified account of the 
CBL theory of human sentence processing. We argue that the notion of su- 
pertagging can become the basis of a model of the grammatical aspects of 
word recognition, provided that certain key adjustments are made to bring it 
in line with the assumptions of psycholinguistic theory (Kim et al., in prepa- 
ration). Before introducing this model, we outline how LTAG can be used to 
advance the formal specification of the CBL theory.⁵ We then turn to some of 
the findings of the model, which capture some of the major phenomena re- 
ported in the human parsing literature. 

LTAG lexicalizes syntactic information in a way that is highly consistent 
with descriptions of the CBL theory, including the lexicalization of head- 
complement relations, filler-gap information, tense, and voice. The value of 
LTAG as a formal framework for a CBL account can be illustrated by the 
LTAG treatment of several psycholinguistically interesting syntactic ambi- 
guities, e.g., prepositional phrase attachment ambiguity, the NP/S comple- 
ment ambiguity, the reduced relative/main clause ambiguity, and the com- 
pound noun ambiguity. In all but one of these cases, the syntactic ambiguity 
is characterized as stemming from a lexical ambiguity. 

Figure 2 (below) presents the LTAG treatment of these ambiguities. 
Each of the sentence fragments in the figure ends with a syntactically am- 
biguous word and is accompanied by possible supertags for that word. First, 
the prepositional phrase attachment ambiguity is illustrated in Figure 2a. The 
ambiguity lies in the ability of the prepositional phrase with the ... to modify 
either the noun phrase the cop (e.g., with the red hair) or modify the verb 
phrase headed by saw (e.g., with the binoculars). Within LTAG, prepositions 
like with indicate lexically whether they modify a preceding noun phrase or 
verb phrase. This causes prepositional phrase attachment ambiguities to 
hinge on the lexical ambiguity of the preposition. Similarly, the NP/S 
ambiguity discussed in Section 2 arises directly from the ambiguity between 
the elementary trees shown in Figure 2b. In this case, these trees encode 
the different complement-taking properties of the verb forgot (e.g., the 
recipe vs. the recipe was ...). Figure 2c shows a string that could be 
parsed as a Noun-Noun compound (e.g., the warehouse fires were 
extinguished) or a Subject-Verb sequence (e.g., the warehouse fires older 
employees). 

⁵Of course, formal specification of this theory can be achieved by using other 
lexicalized grammatical frameworks, e.g., LFG (Bresnan and Kaplan 1982), HPSG 
(Pollard and Sag 1994), CCG (Steedman 1996). 

In non-lexicalist grammars, this ambiguity is treated as arising from the 
major category ambiguity of fires. In LTAG, this ambiguity involves not only 
the category ambiguity but also a more fine-grained ambiguity regarding the 
previous noun warehouse. Due to the nature of the combinatory operations of 
LTAG, nouns that appear as phrasal heads or phrasal modifiers are assigned 
different types of elementary trees (i.e., the Alpha-/Beta- distinction in 
LTAG; see Doran, Egedy, Hockey, Srinivas and Zaidel 1994). Figure 2d 
illustrates the reduced relative/main clause ambiguity (e.g., the defendant 
examined by the 
lawyer was ... vs. the defendant examined the pistol.). Here again, the critical 
features of the phrase structure ambiguity are lexicalized. For instance, the 
position of the gap in an object-extraction relative clause is encoded at the 
verb (right-hand tree in Figure 2d). This is because LTAG trees encode the 
number, type, and position of all verb complements, including those that 
have been extracted. Finally, Figure 2e illustrates a structural ambiguity that 
is not treated lexically in LTAG. As in Figure 2a, the preposition with is as- 
sociated with two elementary trees, specifying verb phrase or noun phrase 
modification. However, in this example, both attachment possibilities in- 
volve the same tree (NP-attachment), which can modify either general or 
secretary. The syntactic information that distinguishes between local and 
non-local attachment is not specified lexically. So, within LTAG, this final 
example is a case of what we might call true attachment ambiguity. This ex- 
ample illustrates the point made earlier that even when a lexical tree is se- 
lected, syntactic processing is not complete, since lexical trees need to be 
combined together through the operations of substitution and adjunction. In 
the first four examples, the selection of lexical trees leaves only a single way 
to combine these items. In the final example, however, multiple combinatory 
possibilities remain even after lexical selection. 

The examples in Figure 2 illustrate the compatibility of LTAG with the 
CBL theory. Both frameworks lexicalize structural ambiguities in similar 
ways, with LTAG providing considerably more linguistic detail. This sug- 
gests that LTAG can be used to provide a more formal statement of the rep- 
resentational claims of the CBL theory. For instance, one can characterize 
the grammatical aspects of word recognition as the parallel activation of pos- 
sible elementary trees. The extent to which a lexical item activates a par- 
ticular elementary tree is determined by the frequency with which it has re- 
quired that tree during an individual’s linguistic experience. The selection of 
a single tree is accomplished through the satisfaction of multiple probabilis- 
tic constraints, including semantic and syntactic contextual cues. The CBL 
theory has traditionally focused on the activation of verb argument structure. 
The introduction of a wide-coverage grammar into this theory generates 
clear predictions about the grammatical representations of other words. In 
particular, the same ambiguity resolution processes occur for all lexical items 
for which LTAG specifies more than one elementary tree. 

Figure 2: LTAG treatment of several psycholinguistically interesting 
syntactic ambiguities: (a) PP-attachment ambiguity; (b) NP/S ambiguity; 
(c) N/V category ambiguity; (d) reduced relative/main clause ambiguity; 
(e) PP-attachment ambiguity with both attachment sites being nominal. 
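
The characterization of word recognition as frequency-weighted parallel activation of elementary trees can be made concrete with a small sketch. The counts, tree names, and function names below are hypothetical and chosen only to mirror the forgot/claimed contrast discussed earlier. Contextual constraints, which the theory also relies on, are left out here.

```python
from collections import Counter

# Hypothetical counts of how often each verb anchored each elementary tree
# in some body of linguistic experience. All numbers are invented.
TREE_COUNTS = {
    "forgot": Counter({"V_np_complement": 70,
                       "V_s_complement": 25,
                       "V_intransitive": 5}),
    "claimed": Counter({"V_np_complement": 10,
                        "V_s_complement": 90}),
}


def activations(word, tree_counts=TREE_COUNTS):
    """Activate every tree of a word in proportion to its relative frequency."""
    counts = tree_counts[word]
    total = sum(counts.values())
    return {tree: n / total for tree, n in counts.items()}


def preferred_tree(word, tree_counts=TREE_COUNTS):
    """Pick the most active tree when no contextual cue is available."""
    acts = activations(word, tree_counts)
    return max(acts, key=acts.get)


for verb in ("forgot", "claimed"):
    print(verb, activations(verb), "->", preferred_tree(verb))
```

With these invented counts, forgot favors the noun phrase complement tree and claimed favors the sentence complement tree, which is the pattern of initial preferences the theory attributes to such verbs.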

The grammatical predictions of LTAG are worked out in an English 
grammar, which is the product of an ongoing grammar development project 
at the University of Pennsylvania (Doran et al. 1994). The grammar provides 
lexical descriptions for 37,000 words and handles a wide range of syntactic 
phenomena, making it a highly robust system. The supertagging work 
described in this paper makes critical use of this grammar. The 
comprehensiveness of the grammar makes it a valuable tool for psycholin- 
guistic work, by allowing formal statements about the structural properties of 
a large fragment of the language. In our case, it plays a critical role in our 
attempt to ‘scale up’ CBL models in order to investigate the viability of such 
models on closer approximations to the full language than they have been 
tested on before. 

4.1 Implementation 

In this section, we describe preliminary results of a computational modeling 
project exploring the ability of the CBL theory to integrate the 
representations of LTAG. We have been developing a connectionist model of the 
grammatical aspects of word recognition, which attempts to account for 
various psycholinguistic findings pertaining to syntactic ambiguity resolu- 
tion (Kim et al., in prep.). Unlike previous connectionist models within the 
CBL approach (McRae et al. 1998, Tabor et al. 1996, Spivey-Knowlton 
1996, Juliano and Tanenhaus 1994), this model has wide coverage in that it 
has an input vocabulary of 20,000 words and is designed to assign 304 dif- 
ferent LTAG elementary trees to input words. The design of the model was 
not guided by the need to match a specific set of psycholinguistic data. 
Rather, we applied simple learning principles to the acquisition of a wide 
coverage grammar, using as input a corpus of highly-variable, naturally oc- 
curring text. Certain patterns of structural preferences and frequency effects, 
which are characteristic of human data, fall directly out of the model’s sys- 
tem of distributed representation and frequency-based learning. 

The model resembles the statistical supertagging model of Srinivas and 
Joshi (1998), which we briefly described above. We have, however, made key 
changes to bring it more in line with the assumptions behind the CBL 
framework. The critical assumptions are that human language comprehen- 
sion is characterized by distributed, similarity-based representations (cf. Sei- 
denberg 1992) and by incremental processing of a sentence. The Srinivas 
and Joshi model permits the use of information from both left and right 
context in the syntactic analysis of a lexical item (through the use of Viterbi 
decoding). Furthermore, their model has a ‘perfect’ memory, which stores the 
structural events involving each lexical item separately and without error. In 
contrast, our model processes a sentence incrementally, and its input and 
internal representations are encoded in a distributed fashion. Distributed rep- 
resentations cause each representational unit to play a role in the representa- 
tion of many lexical items, and the degree of similarity among lexical items 
to be reflected in the overlap of their representations. 

These ideas were implemented in a connectionist network, which provided 
a natural framework for implementing a distributed processing system.⁶ 
The model takes as input information about the orthographic and semantic 
properties of a word and attempts to assign the appropriate supertag 
for the word given the local left context. The architecture of the model 
consists of three layers with feed-forward projections, as illustrated in 
Figure 3. 

The model’s output layer is a 95-unit array of syntactic features that is 
capable of uniquely specifying the properties of 304 different supertags. 
These features completely specify the components of an LTAG elementary 
tree: 1) part-of-speech, 2) type of ‘extraction’, 3) number of complements, 4) 
category of complement, and 5) position of complements. Each of these 
components is encoded with a bank of localist units. For instance, there is a 
separate unit for each of 14 possible parts of speech, and the correct activa- 
tion pattern for a given supertag activates only one of these units (e.g., 
“Noun”). The model was given as input rudimentary orthographic information 
and fine-grained distributional information about a word. Of the input units, 
107 encoded orthographic features: the 50 most common three-letter 
word-initial segments (e.g., ins), the 50 most common two-letter word-final 
segments (e.g., ed), and seven properties such as capitalization, 
hyphenation, etc. The remaining 40 input units provided a ‘distributional profile’ 
of each word, which was derived from a co-occurrence analysis. 
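
The orthographic part of the input can be illustrated with a short sketch. It derives the most common word-initial and word-final segments from a vocabulary and turns a word into a binary feature vector. Two points are assumptions: the function names are invented, and only the two word-level properties that the text names (capitalization and hyphenation) are encoded, since the remaining five are not listed. With the sizes given in the text, the full encoding would have 50 + 50 + 7 = 107 orthographic units, which together with the 40 distributional units implies 147 input units.

```python
from collections import Counter


def top_segments(words, n, take):
    """Return the n most common segments, where take(word) extracts one."""
    counts = Counter(take(w) for w in words if take(w) is not None)
    return [segment for segment, _ in counts.most_common(n)]


def build_encoder(vocabulary, n_prefix=50, n_suffix=50):
    """Build an orthographic encoder from a list of vocabulary words."""
    lower = [w.lower() for w in vocabulary]
    prefixes = top_segments(lower, n_prefix,
                            lambda w: w[:3] if len(w) >= 3 else None)
    suffixes = top_segments(lower, n_suffix,
                            lambda w: w[-2:] if len(w) >= 2 else None)

    def encode(word):
        w = word.lower()
        vec = [1.0 if w.startswith(p) else 0.0 for p in prefixes]
        vec += [1.0 if w.endswith(s) else 0.0 for s in suffixes]
        # Only the two properties named in the text are modeled here.
        vec.append(1.0 if word[:1].isupper() else 0.0)  # capitalization
        vec.append(1.0 if "-" in word else 0.0)         # hyphenation
        return vec

    return encode, prefixes, suffixes


vocab = ["insist", "inspect", "install", "walked", "talked", "Re-enter"]
encode, prefixes, suffixes = build_encoder(vocab, n_prefix=2, n_suffix=2)
print(prefixes, suffixes, encode("Insured"))
```

Because the segment inventories are computed from the vocabulary itself, the scheme needs no morphological analyzer, in keeping with the motivation given for the original encoding.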



⁶This is not to say that left-to-right processing and overlapping representations 
cannot be incorporated into a symbolic statistical system. However, most attempts 
within psycholinguistics to incorporate these assumptions into a computationally 
explicit model have been made within the connectionist framework (e.g., Elman 
1990, Juliano and Tanenhaus 1994, Seidenberg and McClelland 1989). By using a 
connectionist architecture for the current model, we are following this precedent and 
planning comparisons with existing modeling results. 

Figure 3: Architecture of the model 

The orthographic encoding scheme served as a surrogate for the output 
of morphological processing, which is not explicitly modeled here but is 
assumed to be providing interactive input to lexico-syntactic processes that 
are modeled. The scheme was chosen primarily for its simplicity — it was 
automatically derived and easily applied to the training and testing corpus, 
without requiring the use of a morphological analyzer. It was expected to 
correlate with the presence of common English morphological features. 

Similarly, the distributional profiles were used as a surrogate for the
activation of detailed semantic information during word recognition. Although
space prevents a detailed discussion, we note that several researchers have
found that co-occurrence-based distributional profiles provide detailed
information about the semantic similarity between words (cf. Burgess and
Lund 1997, Landauer and Dumais 1997, Schütze 1993). The forty-dimensional
profiles used here were created by first collecting co-occurrence statistics
for a set of 20,000 words in a large corpus of newspaper text.^ The
co-occurrence matrix was compressed by extracting the 40 principal
components of a Singular Value Decomposition (see Kim et al., in preparation,
for details). An informal inspection of the space reveals that it captures
certain grammatical and semantic information. Table 1 shows the nearest
neighbors in the space for some selected words. These are some of the better
examples, but in general the information in the space consistently encodes
semantic similarities between words.

^For each of the 20,000 target words, we counted co-occurrences with a set of
600 high frequency “context” words in 14 million words of Associated Press
newswire. Co-occurrences were collected in a six-word window around each
target word (three words to either side of the word).
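
The counting step behind the distributional profiles (a six-word window, three words to either side of each target word) can be sketched as below. The toy sentence, the word lists, and the function name are assumptions for illustration. The subsequent compression to 40 dimensions by Singular Value Decomposition is not shown.

```python
from collections import Counter


def cooccurrence_counts(tokens, targets, contexts, half_window=3):
    """Count, for each target word, how often each context word occurs within
    `half_window` positions to its left or right."""
    targets, contexts = set(targets), set(contexts)
    counts = {t: Counter() for t in targets}
    for i, word in enumerate(tokens):
        if word not in targets:
            continue
        lo = max(0, i - half_window)
        hi = min(len(tokens), i + half_window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] in contexts:
                counts[word][tokens[j]] += 1
    return counts


# Toy corpus (hypothetical); the paper used 14 million words of newswire.
tokens = "the scientist said the researcher at the lab agreed".split()
counts = cooccurrence_counts(tokens, targets=["scientist", "researcher"],
                             contexts=["the", "said", "at"])
print(dict(counts["scientist"]))
print(dict(counts["researcher"]))
```

Each target word's row of counts over the context words is the raw material for one row of the co-occurrence matrix.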



Word         Nearest Neighbors by Distributional Profile
scientist    researcher, scholar, psychologist, chemist
london       tokyo, chicago, atlanta, paris
literature   poetry, architecture, drama, ballet
believed     feared, suspected, convinced, admitted
bought       purchased, loaned, borrowed, deposited
smashed      punched, cracked, flipped, slammed
confident    hopeful, optimistic, doubtful, skeptical
certainly    definitely, obviously, hardly, usually
from         with, by, at, on



Table 1 : Nearest neighbors of sample words based on distributional 
profiles. 

We implemented two architectural variations on the basic architecture 
described above, which gave the model an ability to maintain information 
over time so that its decisions would be context sensitive. The first variation 
expanded the input pattern to provide on each trial a copy of the input pattern 
from the previous time step along with the current input. This allowed the 
network’s decisions about the current input to be guided by information 
about the preceding input. We will call this architecture the two-word input 
model (2W). The second variation provided simple recurrent feedback from 
the output layer to the hidden layer so that on a given trial the hidden layer 
would receive the previous state of the output layer. This again allowed the 
model’s decision on a given trial to be contingent on activity during the pre- 
vious trial. We call this architecture the output-to-hidden architecture (OH). 
For purposes of brevity, we discuss only the results of the 2W architecture. 
In all statistical analyses reported here, the OH architecture produced the 
same effects as the 2W architecture. 
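
A minimal sketch of how the two-word input (2W) variation could assemble its input: on each trial the previous word's pattern is concatenated with the current one. The all-zero pattern used before the first word is an assumption; the text does not say how sequence-initial trials were handled.

```python
def two_word_inputs(patterns):
    """Yield [previous pattern + current pattern] for each time step.

    `patterns` is a list of equal-length input vectors, one per word.
    The first step is paired with an all-zero "previous" pattern (assumed)."""
    if not patterns:
        return
    previous = [0.0] * len(patterns[0])
    for current in patterns:
        yield list(previous) + list(current)
        previous = current


# Three toy 2-unit patterns stand in for the 147-unit word encodings.
steps = list(two_word_inputs([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))
print(steps)
```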

The model was trained on a 195,000 word corpus of Wall Street Journal 
text, which had been annotated with supertags. The annotation was done by 
translating the annotations of a segment of the Penn Treebank (Marcus, 
Santorini and Marcinkiewicz 1993) into LTAG equivalents (Srinivas 1997). 
During training, for each word in the training corpus, the appropriate
orthographic units and distributional profile pattern were activated in the
input layer. The input activation pattern was propagated forward through the
hidden layer to the output layer. Learning was driven by back propagation of
the error between the model’s output pattern and the correct supertag pattern
for the current word (Rumelhart, Hinton and Williams 1986). 

We tested the overall performance of the model by examining its super- 
tagging accuracy on a 12,000 word subset of the training corpus that was 
held out of training. The network’s syntactic analysis on a given word was 
considered to be the supertag whose desired activation pattern produced the 
lowest error with respect to the model’s actual output (using least squares 
error). On this metric, the model guessed correctly on 72% of these items. 
Using a slightly relaxed metric, the correct supertag was among the model’s 
top three choices (the three supertags with the lowest error) 80% of the time. 
This relaxed metric was used primarily to assess the model’s potential for
increased overall accuracy in future work: if the correct analysis was highly
activated even when it was not the most highly activated analysis, then
future changes (e.g., improvements to the quality of the input
representation) might be expected to increase the model’s overall accuracy.
Accuracy for
basic part of speech on the relaxed metric was 91%. The performance of the 
network can be compared to 79% accuracy for a ‘greedy’ version of the tri- 
gram model of Srinivas and Joshi (1998), which was trained on the same 
corpus. The greedy version eliminated the previously mentioned ability of 
the original model to be influenced by information from right context in its 
decisions about a given word. 
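
The scoring rule described above (choose the supertag whose target pattern has the lowest squared error against the output, and check the three lowest-error supertags for the relaxed metric) can be sketched as follows. The three-unit patterns and the supertag names are toy placeholders, not the model's 95-unit feature array.

```python
def squared_error(output, target):
    """Sum of squared differences between the output and a target pattern."""
    return sum((o - t) ** 2 for o, t in zip(output, target))


def rank_supertags(output, targets):
    """Return supertag names ordered from lowest to highest error."""
    return sorted(targets, key=lambda name: squared_error(output, targets[name]))


def is_correct(output, targets, gold, top_n=1):
    """Strict metric: top_n=1. Relaxed metric in the text: top_n=3."""
    return gold in rank_supertags(output, targets)[:top_n]


# Hypothetical target patterns over a 3-unit output layer.
targets = {
    "NP-comp": [1.0, 0.0, 0.0],
    "S-comp": [0.0, 1.0, 0.0],
    "Intrans": [0.0, 0.0, 1.0],
    "Noun": [1.0, 1.0, 1.0],
}
output = [0.7, 0.4, 0.1]
ranking = rank_supertags(output, targets)
print(ranking)
print(is_correct(output, targets, "S-comp"),
      is_correct(output, targets, "S-comp", top_n=3))
```

In this toy case the S-complement pattern is not the best match but is among the top three, which is the situation the relaxed metric is meant to detect.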

Although these results indicate that the model acquired a substantial 
amount of grammatical knowledge, the main goal of this work is to examine 
the relationship between the model’s operation and human behavioral pat- 
terns, including the patterns of misanalysis characteristic of human process- 
ing. In pursuing this goal, we measure the model’s degree of commitment to 
a given syntactic analysis by the size of its error to that analysis relative to its 
error to other analyses. We make the linking hypothesis that reading time 
elevations due to misanalysis and revision in situations of local syntactic 
ambiguity should be predicted by the model’s degree of commitment to the 
erroneous syntactic analysis at the point of ambiguity. For example, in the 
NP/S ambiguity of example (1), the model’s degree of commitment to the 
NP-complement analysis over the S-complement analysis should predict the 
amount of reading time elevation at the disambiguating region was in.... 

We conducted experiments on the model that mimic the structure of on- 
line processing experiments. The following section discusses the results of 
two experiments, which investigate the model’s processing of the NP/S am- 
biguity and the noun/verb lexical category ambiguity. 
4.2 Modeling the NP/S Ambiguity 

One set of behavioral data that our model aims to account for is the pattern 
of processing difficulty around the NP/S ambiguity discussed in section 2 
and exemplified in (1), repeated here as (2). 

(2) a. The chef forgot the recipe was in the back of the book. 

b. The chef claimed the recipe was in the back of the book. 

In (2a), comprehenders can initially treat the noun phrase the recipe as either 
the NP-complement of forgot or the subject of a sentential complement to 
forgot. Numerous experiments have found that readers of locally ambiguous 
sentences like (2a) often erroneously commit to an NP-complement
interpretation (Holmes et al. 1989, Ferreira and Henderson 1990, Trueswell et al. 
1993, Garnsey et al. 1997). 

Several experiments have found that the general processing bias toward 
the NP-complement is modulated by the structural bias of the main verb 
(Trueswell et al. 1993, Garnsey et al. 1997). Erroneous commitments to the 
NP-complement interpretation are weakened or eliminated when the main 
verb has a strong S-bias (e.g., claimed). Recently, Trueswell and Kim (1998) 
have shown similar effects when verb bias information is introduced to proc- 
essing through a lexical priming technique. Thus, the language processing 
system appears to be characterized simultaneously by an overall bias toward 
the NP-complement analysis and by the influence of the lexical preferences 
of S-bias verbs. 

The coexistence of these two conflicting sources of guidance may be 
explained in terms of “neighborhoods of regularity” in the representation of 
verb argument structure (Seidenberg 1992, Juliano and Tanenhaus 1994). 
NP-complement and S-complement verbs occupy a neighborhood of repre- 
sentations, in which the NP-complement pattern dominates the “irregular” S- 
complement pattern, due to greater frequency. The ability of S-complement 
items to be represented accurately is dependent on frequency. High fre- 
quency S-complement items are accurately represented, but low frequency 
S-complement items are overwhelmed by their dominant NP-complement 
neighbors. Juliano and Tanenhaus (1993) found evidence in support of this 
hypothesis in a study in which the ability of verb bias information to guide 
processing was characterized by an interaction between the frequency and 
the subcategory of the main verb. The ability of S-complement verbs to 
guide processing commitments was correlated with the verb’s lexical fre- 
quency. Low frequency S-complement verbs allowed erroneous commit- 
ments to the NP-complement analysis in spite of the verb’s bias, while high 
frequency S-complement items caused rapid commitments to the correct
S-complement analysis. 

We examined the model’s processing of NP/S ambiguous sentence 
fragments like (3). Detailed results are reported by Kim et al. (in prep.). 

(3) The economist decided ... 

Twenty-eight verbs were selected on the basis of their frequency properties 
in the model’s training corpus. Half of these strongly tended to take S- 
complements and half strongly tended to take NP-complements. Within each 
verb-bias type, half of the target verbs were high in frequency and half were 
low in frequency. Each NP-biased item was matched in frequency to an
S-biased item. These verbs were then embedded in a sentence fragment, which 
was presented to the model. Table 2 shows examples of each of the four con- 
ditions that resulted from crossing verb bias with frequency. 



Example                   Frequency   Structural Bias
The economist decided     High        S-complement
The economist elected     High        NP-complement
The economist denied      Low         S-complement
The economist achieved    Low         NP-complement



Table 2: Examples of materials used to examine the model’s NP/S 
subcategorization performance. Verb frequency and structural bias were 
determined from the properties of the training corpus. 

The results of the experiment are summarized in Table 3. The model 
clearly recognizes NP/S verbs, as demonstrated by the consistency with 
which it assigned either an NP- or an S-complement supertag to the
experimental items (27 of 28 items). Closer examination of the model’s
performance reveals major qualities of human comprehension data, including a
general bias toward the NP-complement structure, which can be overcome by 
lexical information from high frequency S-complement verbs. As illustrated 
in Table 3, all 14 NP-biased verbs were correctly analyzed, but S-biased 
verbs were misanalyzed on 9 of 14 trials, with 8 of the 9 misanalyses being 
to the NP-complement. The dominance of the NP-complement analysis, 
however, is modulated by the frequency of exposure to S-complement items, 
matching the interaction between frequency and verb subcategory in human 
processing shown by Juliano and Tanenhaus (1994). The model showed high 
accuracy on S-biased verbs when they were high in frequency (5 out of 7 
items were correctly analyzed) but showed a tendency to misanalyze low 
frequency S-biased items as NP-complement items (all 7 were misanalyzed, 
with 6 of the errors being to the NP-complement). 



Verb Subcategory   Frequency   S-comp   NP-comp   Other Supertags   Commitment to S-comp
S-comp             High        5        2         0                  0.013
S-comp             Low         0        6         1                 -1.0021
NP-comp            High        0        7         0                 -1.1541
NP-comp            Low         0        7         0                 -1.3343



Table 3: The model’s structural analyses of NP/S Verbs. 

We quantified the model’s degree of commitment to the S-complement
supertag over the NP-complement supertag by subtracting the model’s error
to the S-complement supertag from its error to the NP-complement supertag
(NP-complement error - S-complement error). On this quantification,
negative values indicate commitment to an NP-complement analysis while
positive values indicate commitment to the S-complement analysis. This value
was subjected to an Analysis of Variance with Frequency and Verb Bias as
factors, which showed an interaction between Frequency and Verb Bias,
F(1,24) = 7.04; p < 0.05, as well as main effects of Frequency,
F(1,24) = 14.42; p < 0.001, and Verb Bias, F(1,24) = 22.69; p < 0.0001. 
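
As a worked illustration of this measure, the following sketch computes the commitment score from a pair of errors. The error values are invented for the example and are not taken from the model.

```python
def s_commitment(np_error, s_error):
    """NP-complement error minus S-complement error.

    Positive: the output is closer to the S-complement supertag.
    Negative: the output is closer to the NP-complement supertag."""
    return np_error - s_error


# Invented error values for two hypothetical verbs.
favors_s = s_commitment(np_error=0.9, s_error=0.4)
favors_np = s_commitment(np_error=0.2, s_error=1.3)
print(favors_s > 0, favors_np < 0)
```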



Verb Subcategory Tokens   This Model            Juliano & Tanenhaus (1994)   Penn Treebank
S-complement              2708                  1997                         8502
NP-complement             10583                 5686                         31935
Other                     17367                 5368                         89625
                          (11436 auxiliaries)
All                       30658                 13051                        130062



Table 4: Frequency properties of various training corpora with respect 
to the NP/S ambiguity. 
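
The proportions implied by Table 4 can be recomputed directly from its counts. The sketch below transcribes the table and derives, for each corpus, the ratio of NP-complement to S-complement tokens and the S-complement share of all verb tokens; the dictionary layout and variable names are illustrative only. For the present model's training corpus the ratio comes to roughly 3.9, which is the "4 to 1" dominance of NP-complement tokens.

```python
# Token counts transcribed from Table 4.
TABLE4 = {
    "This model": {"S": 2708, "NP": 10583, "Other": 17367, "All": 30658},
    "Juliano & Tanenhaus (1994)": {"S": 1997, "NP": 5686, "Other": 5368, "All": 13051},
    "Penn Treebank": {"S": 8502, "NP": 31935, "Other": 89625, "All": 130062},
}

for corpus, n in TABLE4.items():
    # The three subcategory counts should add up to the "All" row.
    assert n["S"] + n["NP"] + n["Other"] == n["All"]
    np_to_s = n["NP"] / n["S"]   # how strongly NP-complements dominate
    s_share = n["S"] / n["All"]  # S-complement share of all verb tokens
    print(f"{corpus}: NP/S = {np_to_s:.2f}, S share = {s_share:.1%}")
```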

The model’s frequency-by-subcategory interaction arises from its system of 
distributed representation and frequency sensitive learning. S-complement 



®Both S-complement and NP-complement verbs come in multiple versions, cor- 
responding to different constructions such as Wh-extraction, passivization, etc. In 
both cases, we computed error with respect to the unextracted, main clause tree. 
verbs and NP-complement verbs have a substantial overlap in input
representation, due to distributional and orthographic similarities (-ed, -ng, etc.) 
between the two types of verbs and the fact that S-complement verbs are 
often NP/S ambiguous. NP-complement tokens dominate S-complement 
tokens in frequency (4 to 1, as shown in Table 4), causing overlapping input 
features to be more frequently associated with the NP-complement output 
than the S-complement output during training. The result is that a portion of 
the input representation of S-complement verbs becomes strongly associated 
with the NP-complement output, causing a tendency for the model to 
misanalyze S-complement items as NP-complement items. The model is able 
to identify non-overlapping input features that distinguish S-complement 
verbs from their dominant neighbors, but its ability to do so is affected by 
frequency. When S-complement verbs are seen in high frequencies, their 
distinguishing features are able to influence connection weights enough to 
allow accurate representation; however, when S-complement verbs are seen 
in low frequencies, their NP-complement-like input features dominate their 
processing. The explanation here is similar to the explanation given by Sei- 
denberg and McClelland (1989) for frequency-by-regularity interactions in 
word naming (e.g., the high frequency irregularity of have vs. the regularity 
of gave, wave, save) and past tense production. 

The theoretical significance of this interaction lies partly in its emer- 
gence in a comprehensive model, which is designed to resolve a wide range 
of syntactic ambiguities over a diverse sample of the language. These results 
provide a verification of conclusions drawn by Juliano and Tanenhaus (1994) 
from a much simpler model, which acquired a similar pattern of knowledge 
about NP-complement and S-complement verbs from co-occurrence infor- 
mation about verbs and the words that follow them. It is important to provide 
such follow-up work for Juliano and Tanenhaus (1994), because their simpli- 
fications of the domain were extreme enough to allow uncertainty about the 
scalability of their results. Although their training materials were drawn from 
naturally occurring text (the Wall Street Journal and Brown corpora), they 
sampled only a subset of the verbs in that text and the words occurring after 
those verbs. S-complement tokens were more common in their corpus than 
in the full language (2.5 times more common than in the full corpus from 
which their training materials were drawn), and only past-tense tokens were 
sampled. This constitutes a substantial simplification of the co-occurrence 
information available in the full language. In our sample of the Wall Street 
Journal corpus, non-auxiliary verbs account for only 10.8% of all tokens, 
suggesting that the full language may contain many co-occurrence events 
that are ‘noise’ with respect to the pattern detected by the Juliano and Ta- 
nenhaus (1994) model. For instance, as they observe, their domain restricts 
the range of contexts in which the determiner the occurs, obscuring the fact 
that in the full language, the often introduces a subject noun phrase rather 
than an object noun phrase. It is conceivable that the complexity of the full 
language would obscure the pattern of co-occurrences around the NP/S am- 
biguity sufficiently to prevent a scaled up constraint-based model from ac- 
quiring the pattern of knowledge acquired by the Juliano and Tanenhaus 
1994 model. Our results demonstrate that the processing and representational 
assumptions that allow constraint based models to naturally express fre- 
quency-by-regularity interactions are scalable — they continue to emerge 
when the domain is made very complex. 

4.3 Modeling the Noun/Verb Lexical Category Ambiguity 

Another set of behavioral data that our model addresses is the pattern of 
reading times around lexical category ambiguities like that of fires in (4). 

(4) a. the warehouse fires burned for days. 

b. the warehouse fires many workers every spring. 

The string warehouse fires can be interpreted as a compound noun phrase (4a)
or a subject-verb sequence (4b). This syntactic ambiguity is anchored by 
the lexical ambiguity of fires, which can occur as either a noun or a verb. 

Several experiments have shown that readers of sentences like (4a) often 
commit erroneously to a subject-verb interpretation, as indicated by proc- 
essing difficulty at the next word (burned), which is inconsistent with the 
erroneous interpretation and resolves the temporary ambiguity. Corley 
(1998) has shown that information about the category bias of the ambiguous 
word is rapidly employed in the resolution of this ambiguity. When the am- 
biguous word is one that tends statistically to be a verb, readers tend to 
commit erroneously to the subject-verb interpretation, but when the word 
tends to occur as a noun, readers show no evidence of misanalysis. Mac- 
Donald (1993) has found evidence of more subtle factors, including the rela- 
tive frequency with which the preceding noun occupies certain phrase- 
structural positions, the frequency of co-occurrence between the preceding 
noun and ambiguous word, and semantic fit information. Most importantly 
for the current work, MacDonald found that when the ambiguous word was 
preceded by a noun that tended to occur as a phrasal head, readers tended to 
commit to the subject-verb interpretation. However, when the preceding 
noun tended to occur as a noun modifier, readers tended to commit immedi- 
ately to the correct noun-noun compound analysis. 

The overall pattern of data suggests a relatively complex interplay of 
constraints in the resolution of lexical category ambiguity. Lexically specific 
information appears to be employed very rapidly and processing commit- 
ments appear to be affected by multiple sources of information, including 
subtle cues like the modifier/head likelihood of a preceding noun. 

We examined the ability of the model to resolve lexical category ambi- 
guities by presenting it with strings containing noun/verb ambiguous words, 
as exemplified by (5). 

(5) a. The emergency plans ... 

b. The division plans ... 

The experiment examined the effect of the category bias of the ambiguous 
word and the modifier/head likelihood of the preceding noun. 

Sixty noun/verb ambiguous words were collected from the training cor- 
pus. These words were either biased toward a noun interpretation, biased 
toward a verb interpretation, or equi-biased (20 of each category). The mem- 
bers of the three categories of bias were matched item-wise for overall 
training frequency. 

Eight nouns were selected from the training corpus to occupy the pre- 
ceding noun position of the experimental materials. Four of these were nouns 
that tended to occur as phrasal heads in the corpus (e.g., division), and the 
other four were nouns that tended to occur as noun modifiers in the corpus 
(e.g., emergency). Context nouns were matched pair-wise for overall training 
frequency. 

Experimental items consisted of a determiner, a context noun, and a 
noun/verb ambiguous item. Each of the eight context nouns was paired with 
each of the 60 N/V ambiguous items, creating 480 items like those in Table 
5. The complete set of materials is described in Kim et al. (in prep.). 



Example Item          Context Support   Lexical Category Bias
The emergency plans   Noun              N-Bias
The emergency bid     Noun              EQ-Bias
The emergency pay     Noun              V-Bias
The division plans    Verb              N-Bias
The division bid      Verb              EQ-Bias
The division pay      Verb              V-Bias



Table 5: Examples of materials used to examine the model’s resolution 
of the noun/verb category ambiguity. 



The model clearly recognized the target words to be either nouns or 
verbs. Only 16 out of 480 items were assigned a supertag that was neither a 
noun supertag nor a verb supertag. The model’s resolution of the noun/verb
ambiguity showed effects of the category bias of the ambiguous word and
the Head/Modifier likelihood of the preceding noun, both of which have
been shown in human processing (Corley 1998, MacDonald 1993). The
model showed strong commitments to the contextually supported category
for equi-biased words and also for biased words when the context supported
the dominant sense of the word. The model had difficulty activating the
subordinate sense of a biased word, even when it was supported by context.
This is illustrated by examining the activation values of the noun and verb
part-of-speech units separately from the rest of the output layer, as shown
in Table 6 
(Column 3). For biased words occurring in contexts that supported the 
word’s dominant category, the contextually supported part-of-speech unit 
had higher activation than the contextually unsupported unit for 159 of 160 
items (80/80 for N-bias word in N-support context and 79/80 for V-bias word 
in V-support context). For equi-biased items, the contextually supported unit 
was more highly active for 130/160 items (68/80 for N-support and 62/80 for 
V-support). However, for biased words occurring in contexts that support the 
subordinate category, the model showed difficulty activating the contextually 
supported unit, with the contextually supported unit showing superior acti- 
vation for only 47 out of 160 items (46/80 for N-support with V-bias and 
11/80 for V-support with N-bias). 



Context Type   Verb Bias   Superior Activation,           Degree of Commitment to
                           Contextually Supported Unit    Noun Interpretation
N-Support      N-Bias      80/80                          0.99
N-Support      EQ-Bias     68/80                          0.82
N-Support      V-Bias      11/80                          0.50
V-Support      N-Bias      47/80                          0.76
V-Support      EQ-Bias     62/80                          0.32
V-Support      V-Bias      79/80                          0.08



Table 6: The proportion of times that the contextually supported part-of- 
speech unit was given superior activation for noun/verb ambiguous 
words in each of six conditions (column 3) and the model's degree of 
commitment to a Noun analysis (column 4). 

We quantified the model’s degree of commitment to the noun analysis by 
dividing the noun unit activation by the total activation across the noun and 
verb units (Noun-Activation / (Noun-Activation + Verb-Activation)). This is 
summarized in Table 6. The closer this value is to 1.0, the greater the 
model’s commitment to the noun analysis over the verb analysis, and the 
closer to 0.0, the greater the commitment to a verb analysis. This value was 
subjected to an Analysis of Variance with Context (N-Support, V-Support) 
and Category Bias (N-bias, EQ-bias, V-bias) as factors. The model showed a 
clear effect of lexical category bias, with N-bias items causing a mean noun 
commitment of 0.88, EQ-bias items causing 0.57, and V-bias items causing 
0.29, F(2,57) = 58.23; p < 0.0001. Second, there was an effect of context: in 
the context of N-support nouns, the model tended to commit more strongly 
to noun analyses (mean noun commitment 0.77) than in the context of
V-support nouns (mean noun commitment 0.39), F(1,57) = 238.01; p < 0.0001. 
Finally, the model showed an interaction between Context and Category- 
Bias with a strong tendency to activate a context-supported pattern for words 
whose bias agreed with the context and for EQ-biased words, but not when 
the category bias disagreed with the context, F(2,57) = 0.0001; p < 0.0001. 
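
The commitment measure and the condition means reported above can be checked against Table 6 with a short sketch. The function and variable names are illustrative; the cell values are transcribed from column 4 of Table 6. Averaging the cells that share a factor level gives marginal means that agree with the reported values (0.88, 0.57, 0.29 for category bias; 0.77, 0.39 for context) to within rounding.

```python
def noun_commitment(noun_activation, verb_activation):
    """Noun-Activation / (Noun-Activation + Verb-Activation)."""
    return noun_activation / (noun_activation + verb_activation)


# Cell means of noun commitment, transcribed from Table 6 (column 4).
CELLS = {
    ("N-Support", "N-Bias"): 0.99, ("N-Support", "EQ-Bias"): 0.82,
    ("N-Support", "V-Bias"): 0.50, ("V-Support", "N-Bias"): 0.76,
    ("V-Support", "EQ-Bias"): 0.32, ("V-Support", "V-Bias"): 0.08,
}


def marginal_mean(factor_index, level):
    """Average the cell means that share one level of a factor
    (0 = context, 1 = category bias). Valid because every cell has 80 items."""
    values = [v for key, v in CELLS.items() if key[factor_index] == level]
    return sum(values) / len(values)


for bias in ("N-Bias", "EQ-Bias", "V-Bias"):
    print(bias, f"{marginal_mean(1, bias):.3f}")
for context in ("N-Support", "V-Support"):
    print(context, f"{marginal_mean(0, context):.3f}")
```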

Interestingly, the interaction between word bias and context resembles 
the “subordinate bias” effect observed in the semantic aspects of word rec- 
ognition (Duffy, Morris and Rayner 1988). When semantically ambiguous 
words are encountered in biasing contexts, the effects of context depend on 
the nature of the word’s bias. When the context supports the subordinate 
sense of a biased ambiguous word, processing difficulty occurs. When the 
context supports the dominant sense or when it supports either sense of an 
equi-biased word, no processing difficulty occurs. Our model shows a 
qualitatively identical effect with respect to category ambiguity. We take this 
as further support for the idea, central to lexicalist theories, that lexical and 
syntactic processing obey many of the same processing principles. On the 
basis of this kind of effect in the model, we predict that human comprehend- 
ers should show subordinate bias effects in materials similar to the ones used 
here. Furthermore, because the subordinate bias effects found here are quite 
natural given the model’s system of representation and processing, we would 
expect similar effects to arise in the model and in humans with respect to 
other syntactic ambiguities that are affected by local left context (see 
Trueswell 1996, for similar predictions about subordinate bias effects
involving the main clause/relative clause ambiguity). 

The model’s use of fine-grained contextual cues in resolving category 
ambiguities strongly suggests the viability of using such cues to inform syn- 
tactic decisions in human language processing. This goes against suggestions 
in the literature that such fine-grained information is often too sparse to ac- 
curately drive a statistical model of the language (Mitchell et al. 1995, Cor- 
ley and Crocker 1996). We return to this issue in the next section. 
5 General Discussion 

In this paper, we have attempted to advance the grammatical coverage and 
formal specification of Constraint-based Lexicalist models of language com- 
prehension. A convergence of perspectives between constraint-based theory 
in psycholinguistics and work in theoretical and computational linguistics 
has supported and guided our proposals. We have attempted to give a con- 
crete description of the syntactic aspects of the CBL theory by attributing to 
human lexical knowledge the grammatical properties of a wide coverage 
Lexicalized Tree Adjoining Grammar (Doran et al. 1994). In developing a 
processing model, we have drawn insight from work on processing with 
LTAG which suggests that statistical mechanisms for lexical ambiguity 
resolution may accomplish much of the computation of parsing when applied 
to rich lexical descriptions like those of LTAG (Srinivas and Joshi 1998). We 
have incorporated these ideas about grammar and processing into a psycho- 
logically motivated model of the grammatical aspects of word recognition, 
which is wide in grammatical coverage. 

The model we describe is general in purpose; it acquires mappings be- 
tween a large sample of the lexical items of the language and a large number 
of rich grammatical representations. Its design does not target any particular 
set of syntactic ambiguities or lexical items. Nevertheless, it is able to quali- 
tatively capture subtle patterns of human processing data, such as the fre- 
quency-by-regularity interaction in the NP/S ambiguity (Juliano and Tanen- 
haus 1993) and the use of fine-grained contextual cues in resolving lexical 
category ambiguities (MacDonald 1993). 

The wide range of grammatical constructions faced by the model and 
the diversity of its sample of language include much of the complexity of the 
full language and support the idea that constraint-based models of sentence 
processing are viable, even on a large grammatical scale. The model pro- 
vides an alternative to the positions of Mitchell et al. (1995) and Corley and 
Crocker (1996), which propose statistical processing models with only 
coarse-grained parameters such as part-of-speech tags. Their argument is that 
the sparsity of some statistical data causes the fine-grained parameters of 
constraint-based models to be “difficult to reliably estimate” (Corley and 
Crocker 1996) and that the large number of constraints in constraint-based 
models causes the management of all these constraints to be computationally 
intensive. Such arguments assume that a coarse-grained statistical model is 
more viable and more ‘compact’ than a fine-grained model. 

The issue of whether fine-grained statistical processing is viable may 
hinge on some basic computational assumptions. The observation that the 
sparsity of statistical data affects the performance of statistical processing 
systems is certainly valid. But there are a number of reasons why this does 
not support arguments against fine-grained statistical processing models. 
First, there is a large class of statistical processing models, including 
connectionist systems like the one used here, that are well suited to the use 
of imperfect cues. For instance, a common strategy employed by statistical 
NLP systems to deal with sparse data is to ‘back off’ to statistics of a coarser 
grain. This is often done explicitly, as in verb subcategorization methods, 
where decisions are conditionalized on lexical information (individual verbs) 
when the lexical item is common, but are conditionalized on (backed off to) 
basic category information (all verbs), when the lexical item is rare (Collins 
1996). In connectionist systems like ours, statistical back-off is the flip-side 
of the network’s natural tendency to generalize but also to be guided by fine- 
grained cues when those cues are encountered frequently. Fine grained fea- 
tures of a given input pattern are able to influence behavior when they are 
encountered frequently, because they are given repeated opportunities to 
influence connection weights. When such fine-grained features are not en- 
countered often enough, they are overshadowed by coarser-grained input 
features, which are by their very nature more frequent. Systems like our 
model can be seen as discovering back-off points. We argue that systems that 
do such backing off are the appropriate class of system for modeling much of 
sentence processing. As a back-propagation learning system with multiple 
grammatical tasks competing for a limited pool of processing resources, our 
model is essentially built to learn to ignore unreliable cues. 
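
The explicit form of back-off mentioned above can be illustrated with a minimal sketch. It is not the connectionist model described in this paper, nor the procedure of any cited parser; the threshold `min_count`, the function name, and the toy counts are all assumptions made for illustration. The estimator conditions on the individual verb when that verb has been seen often enough, and otherwise falls back to the frame distribution over all verbs.

```python
from collections import Counter

def make_backoff_estimator(observations, min_count=5):
    """Build P(frame | verb) with back-off to P(frame) for rare verbs.

    observations: iterable of (verb, frame) pairs.
    min_count: hypothetical threshold below which a verb counts as rare.
    """
    verb_frame = Counter()   # counts of (verb, frame) pairs
    verb_total = Counter()   # counts of each verb
    frame_total = Counter()  # counts of each frame over all verbs
    n = 0
    for verb, frame in observations:
        verb_frame[(verb, frame)] += 1
        verb_total[verb] += 1
        frame_total[frame] += 1
        n += 1

    def prob(frame, verb):
        # Common verb: use the fine-grained, lexically specific estimate.
        if verb_total[verb] >= min_count:
            return verb_frame[(verb, frame)] / verb_total[verb]
        # Rare or unseen verb: back off to the coarse-grained estimate.
        return frame_total[frame] / n if n else 0.0

    return prob

# Toy data (invented): frames are NP-complement or S-complement.
data = ([("believe", "S")] * 6 + [("believe", "NP")] * 2
        + [("see", "NP")] * 10 + [("surmise", "S")])
prob = make_backoff_estimator(data, min_count=5)
print(prob("S", "believe"))  # lexically specific estimate
print(prob("S", "surmise"))  # backed-off estimate over all verbs
```

A fixed threshold is the crudest possible choice of back-off point; the claim in the text is that a learning system can discover such points rather than have them stipulated.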

Thus, the interaction between frequency and subcategory that we have 
discussed emerges naturally in the operation of statistical processing devices 
like the model described here. Fine-grained information about S-complement 
verbs is able to guide processing when it is encountered often enough during 
training to influence connection weights in spite of the dominance of NP- 
complement signals. The ability of Head/Modifier likelihood cues about 
nouns to influence connection weights is similarly explained. 

In general, we view the sparsity of data as an inescapable aspect of the 
task of statistical language processing rather than as a difficulty that a system 
might avoid by retreating to more easily estimable parameters. Even part-of- 
speech tagging models like Corley and Crocker’s (1996) include a lexical 
component, which computes the likelihood of a lexical item given a candi- 
date part-of-speech for that word, and their model is therefore affected by 
sparsity of data for individual words — this is true for any tagger based on the 
dominant Hidden Markov Model framework. Furthermore, as mentioned 
earlier, work in statistical NLP has increasingly indicated that lexical infor- 
mation is too valuable to ignore in spite of the difficulties it may pose. Tech- 
niques that count lexically specific events have generally out-performed 
techniques that do not, such as statistical context-free grammar parsing sys- 
tems (see Marcus 1995). It seems to us that, given a commitment to statisti- 
cal processing models in general, there is no empirical or principled reason 
to restrict the granularity of statistical parameters to a particular level, such 
as the part-of-speech tags of a given corpus. Within the engineering work on 
part-of-speech tagging, there are a number of different tag-sets, which vary 
in the granularity of their tags for reasons unconnected to psychological re- 
search, so that research does not motivate a psychological commitment to 
any particular level of granularity. Furthermore, the idea that the language 
processing system should be capable of counting statistical events at only a 
single level of granularity seems to be an assumption that is inconsistent with 
much that is known about cognition, such as the ability of the visual proc- 
essing system to combine probabilistic cues from many levels of granularity 
in the recognition of objects. The solution to the data sparsity problem, as 
manifested in humans and in successful engineering systems, is to adopt the 
appropriate learning and processing mechanisms for backing off to more 
reliable statistics when necessary. 
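
To make the point about the lexical component concrete, the following sketch estimates the quantity P(word | tag) by relative frequency from a tagged sample. It is only an illustration under stated assumptions: the function name, the tag labels, and the five-token corpus are invented, and no smoothing is applied. Any word that is rare or absent in the sample receives an unreliable or zero estimate, which is the sparsity problem in miniature.

```python
from collections import Counter, defaultdict

def emission_probs(tagged_words):
    """Relative-frequency estimate of P(word | tag) from (word, tag) pairs."""
    tag_total = Counter()            # how often each tag occurs
    word_tag = defaultdict(Counter)  # per-tag counts of each word
    for word, tag in tagged_words:
        tag_total[tag] += 1
        word_tag[tag][word] += 1

    def p(word, tag):
        total = tag_total[tag]
        # Unseen tags and unseen words both yield 0.0 without smoothing.
        return word_tag[tag][word] / total if total else 0.0

    return p

# Toy corpus (invented); "saw" is ambiguous between noun and verb.
corpus = [("the", "DT"), ("man", "NN"), ("saw", "VBD"),
          ("the", "DT"), ("saw", "NN")]
p = emission_probs(corpus)
print(p("saw", "NN"))     # estimated from a single observation
print(p("hammer", "NN"))  # unseen word: zero without smoothing
```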

We have argued that the complexities of statistical processing over fine- 
grained lexical information do not warrant the proposal of lexically-blind 
processing mechanisms in human language comprehension. Although the 
complexities may be unfamiliar, they are tractable, and there are large pay- 
offs to dealing with them. An increasingly well-understood class of con- 
straint-satisfaction mechanisms is well suited to recognizing fine-grained 
lexical patterns and also to backing off to coarser-grained cues when fine- 
grained data is sparse. The modeling work described here and research in 
computational linguistics suggest that such mechanisms, when applied to 
the rich lexical representations of lexicalized grammars, can accomplish a 
substantial amount of syntactic analysis. Furthermore, the kind of mecha- 
nism we describe here shows a pattern of processing that strongly resembles 
human processing data, suggesting that such mechanisms are good models of 
human processing of speech and text. 

References 

Boland, J.E., Tanenhaus, M.K., Garnsey, S.M. and Carlson, G.N. 1995. Verb argu- 
ment structure in parsing and interpretation: Evidence from wh-questions. Jour- 
nal of Memory and Language, 34, 774-806. 

Bresnan, J. and Kaplan, R. 1982. Lexical functional grammar: A formal system of 
grammatical representation. In J. Bresnan (Ed.), Mental Representation of 
Grammatical Relations. Cambridge, MA: MIT Press. 

Burgess, C. and Lund, K. 1997. Modeling parsing constraints with high-dimensional 
context space. Language and Cognitive Processes, 12, 177-210. 

Charniak, E. 1993. Statistical language learning. Cambridge, MA: MIT Press. 

Chomsky, N. 1995. The minimalist program. Cambridge, MA: MIT Press. 

Church, K. and Mercer, R. 1993. Introduction to the special issue on computational 
linguistics using large corpora. Computational Linguistics, 19, 1-24. 

Collins, M. 1996. A new statistical parser based on bigram lexical dependencies. In 
Proceedings of the 34th Annual Meeting of the ACL, Santa Cruz. 

Corley, S. 1998. A Statistical Model of Human Lexical Category Disambiguation. 
Unpublished doctoral dissertation. University of Edinburgh, Edinburgh, UK. 

Corley, S. and Crocker, M.W. 1996. Evidence for a tagging model of human lexical 
category disambiguation. In Proceedings of the 18th Annual Conference of the 
Cognitive Science Society. 

Crocker, M.W. 1994. On the nature of the principle-based sentence processor. In C. 
Clifton, L. Frazier and K. Rayner (Eds.), Perspectives on sentence processing. 
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. 

Doran, C., Egedi, D., Hockey, B.A., Srinivas, B. and Zaidel, M. 1994. XTAG system 
- a wide coverage grammar for English. In Proceedings of the 17th International 
Conference on Computational Linguistics (COLING 94). Kyoto, Japan. 

Duffy, S.A., Morris, R.K. and Rayner, K. 1988. Lexical ambiguity and fixation times 
in reading. Journal of Memory and Language, 27, 429-446. 

Elman, J. 1990. Finding structure in time. Cognitive Science, 14, 179-211. 

Ferreira, F. and Clifton, C. 1986. The independence of syntactic processing. Journal 
of Memory and Language, 25, 348-368. 

Ferreira, F. and Henderson, J.M. 1990. The use of verb information in syntactic 
parsing: A comparison of evidence from eye movements and word-by-word self- 
paced reading. Journal of Experimental Psychology: Learning, Memory, and 
Cognition, 16, 555-568. 

Ford, M., Bresnan, J. and Kaplan, R.M. 1982. A competence-based theory of syntac- 
tic closure. In J. Bresnan (Ed.), The mental representation of grammatical rela- 
tions. Cambridge, MA: MIT Press. 

Frazier, L. 1995. Constraint satisfaction as a theory of sentence processing. Journal 
of Psycholinguistic Research, 24, 437-468. 

Frazier, L. 1989. Against lexical generation of syntax. In W.D. Marslen-Wilson (Ed.), 
Lexical Representation and Process. Cambridge, MA: MIT Press. 

Frazier, L. 1979. On comprehending sentences: Syntactic parsing strategies. Bloom- 
ington, IN: Indiana University Linguistics Club. 

Garnsey, S.M., Pearlmutter, N.J., Myers, E. and Lotocky, M.A. 1997. The contribu- 
tions of verb bias and plausibility to the comprehension of temporarily ambigu- 
ous sentences. Journal of Memory and Language, 37, 58-93. 

Gibson, E. 1998. Linguistic complexity: Locality of syntactic dependencies. Cogni- 
tion, 68, 1-76. 

Gross, M. 1984. Lexicon-grammar and the syntactic analysis of French. In Proceed- 
ings of the 10th International Conference on Computational Linguistics 
(COLING ’84), Stanford, CA. 

Holmes, V. M., Stowe, L. and Cupples, L. 1989. Lexical expectations in parsing 
complement-verb sentences. Journal of Memory and Language, 28, 668-689. 






Joshi, A. and Schabes, Y. 1996. Handbook of Formal Languages and Automata. Ber- 
lin: Springer-Verlag. 

Juliano, C. and Tanenhaus, M.K. 1994. A constraint-based lexicalist account of the 
subject/object attachment preference. Journal of Psycholinguistic Research, 23, 
459-471. 

Juliano, C. and Tanenhaus, M.K. 1993. Contingent frequency effects in syntactic 
ambiguity resolution. In Proceedings of the 15th Annual Conference of the Cog- 
nitive Science Society. Hillsdale, NJ: Erlbaum. 

Jurafsky, D. 1996. A probabilistic model of lexical and syntactic access and disam- 
biguation. Cognitive Science, 20, 137-194. 

Kim, A., B. Srinivas, and J. Trueswell. In preparation. To appear in P. Merlo and S. 
Stevenson (eds.). Sentence processing and the lexicon: Formal, computational 
and experimental perspectives. Philadelphia: John Benjamins. 

Landauer, T.K. and Dumais, S.T. 1997. A solution to Plato’s problem: The latent 
semantic analysis theory of acquisition, induction and representation of knowl- 
edge. Psychological Review, 104, 211-240. 

Lewis, R.L. 1993. An architecturally-based theory of human sentence comprehen- 
sion. Unpublished doctoral dissertation, Carnegie Mellon University, Pittsburgh, 
PA. 

Lund, K., Burgess, C. and Atchley, R. 1995. Semantic and associative priming in 
high-dimensional semantic space. In Proceedings of the 17th Annual Conference 
of the Cognitive Science Society. 

MacDonald, M. 1994. Probabilistic constraints and syntactic ambiguity resolution. 
Language and Cognitive Processes, 9, 157-201. 

MacDonald, M.C. 1993. The interaction of lexical and syntactic ambiguity. Journal 
of Memory and Language, 32, 692-715. 

MacDonald, M.C., Pearlmutter, N.J. and Seidenberg, M.S. 1994. Lexical nature of 
syntactic ambiguity resolution. Psychological Review, 101, 676-703. 

Marcus, M.P. 1995. New trends in natural language processing: Statistical natural 
language processing. Proceedings of the National Academy of Science, volume 
92, 10052-10059. 

Marcus, M.P, Santorini, B. and Marcinkiewicz, M.A. 1993. Building a large anno- 
tated corpus of English: The Penn Treebank. Computational Linguistics, 19, 
313-330. 

McRae, K., Spivey-Knowlton, M.J. and Tanenhaus, M.K. 1998. Modeling the influ- 
ence of thematic fit (and other constraints) in on-line sentence comprehension. 
Journal of Memory and Language, 38, 283-312. 

Mitchell, D.C. 1987. Lexical guidance in human parsing: Locus and processing char- 
acteristics. In M. Coltheart (Ed.), Attention and performance XII: The psychol- 
ogy of reading. Hillsdale, NJ: Lawrence Erlbaum Associates. 

Mitchell, D.C. 1989. Verb-guidance and other lexical effects in parsing. Language 
and Cognitive Processes, 4, 123-154. 

Mitchell, D.C., Cuetos, F., Corley, M.M.B. and Brysbaert, M. 1995. Exposure-based 
models of human parsing. Journal of Psycholinguistic Research, 24, 469-488. 

Perfetti, C.A. 1990. The cooperative language processors: Semantic influences in an 
autonomous syntax. In G.B. Flores d’Arcais, D.A. Balota and K. Rayner (Eds.), 
Comprehension processes in reading. Hillsdale, NJ: Erlbaum. 

Pollard, C. and Sag, I. 1994. Head-driven Phrase Structure Grammar. Chicago, IL: 
University of Chicago Press. 

Prince, A. and Smolensky, P. 1997. Optimality: From neural networks to universal 
grammar. Science, 275, 1604-1610. 

Pritchett, B.L. 1992. Grammatical competence and parsing performance. Chicago, 
IL: The University of Chicago Press. 

Rayner, K., Carlson, M. and Frazier, L. 1983. The interaction of syntax and seman- 
tics during sentence processing. Journal of Verbal Learning and Verbal Behav- 
ior, 22, 358-374. 

Rayner, K. and Frazier, L. 1987. Parsing temporarily ambiguous complements. 
Quarterly Journal of Experimental Psychology, 39A, 657-673. 

Rumelhart, D., Hinton, G. and Williams, R. 1986. Learning representations by back- 
propagating errors. Nature, 323, 533-536. 

Schütze, H. 1993. Word space. In S. Hanson, J. Cowan, and C. Giles (Eds.), Neural 
Information Processing Systems 5. San Mateo, CA: Morgan Kaufmann. 

Seidenberg, M.S. 1992. Connectionism without tears. In S. Davis (Ed.), 
Connectionism: Theory and Practice. New York, NY: Oxford University Press. 

Seidenberg, M.S. and McClelland, J.L. 1989. A distributed, developmental model of 
word recognition and naming. Psychological Review, 96, 523-568. 

Sleator, D. and Temperley, D. 1991. Parsing English with a link grammar. Technical 
report CMU-CS-91-196, Department of Computer Science, Carnegie Mellon 
University. 

Spivey-Knowlton, M.J. 1996. Integration of visual and linguistic information: Hu- 
man data and model simulations. Unpublished doctoral dissertation. University 
of Rochester, Rochester, NY. 

Spivey-Knowlton, M.J. and Sedivy, J. 1995. Resolving attachment ambiguities with 
multiple constraints. Cognition, 55, 227-267. 

Srinivas, B. 1997. Complexity of lexical descriptions and its relevance to partial 
parsing. Unpublished doctoral dissertation. University of Pennsylvania, Phila- 
delphia, PA. 

Srinivas, B. and Joshi, A.K. In press. Supertagging: An approach to almost parsing. 
Accepted for publication in Computational Linguistics. 

Steedman, M. 1996. Surface Structure and Interpretation. Cambridge, MA: MIT 
Press. 

Stevenson, S. 1994. Competition and recency in a hybrid network model of syntactic 
disambiguation. Journal of Psycholinguistic Research, 23, 295-322. 

Tabor, W., Juliano, C. and Tanenhaus, M. 1996. A dynamical system for language 
processing. In Proceedings of the 18th Annual Conference of the Cognitive Sci- 
ence Society. 

Trueswell, J. 1996. The role of lexical frequency in syntactic ambiguity resolution. 
Journal of Memory and Language, 35, 566-585. 

Trueswell, J.C. and Kim, A.E. 1998. How to prune a garden-path by nipping it in the 
bud: Fast-priming of verb argument structures. Journal of Memory and Lan- 
guage, 39, 102-123. 

Trueswell, J. and Tanenhaus, M. 1994. Toward a lexicalist framework for constraint- 
based syntactic ambiguity resolution. In C. Clifton, K. Rayner and L. Frazier 
(eds.), Perspectives on sentence processing. Hillsdale, NJ: Lawrence Erlbaum 
Associates. 

Trueswell, J., Tanenhaus, M., and Garnsey, S. 1994. Semantic Influences on Parsing: 
Use of Thematic Role Information in Syntactic Ambiguity Resolution. Journal 
of Memory and Language, 33, 285-318. 

Trueswell, J., Tanenhaus, M., and Kello, C. 1993. Verb-specific constraints in sen- 
tence processing: Separating effects of lexical preference from garden-paths. 
Journal of Experimental Psychology: Learning, Memory and Cognition, 19, 
528-553. 



Albert E. Kim 
John C. Trueswell 

Institute for Research in Cognitive Science 

University of Pennsylvania 

3401 Walnut Street, Suite 400A 

Philadelphia, PA 19104 

alkim@psych.upenn.edu 

trueswel@psych.upenn.edu 



Bangalore Srinivas 
AT&T Research 
180 Park Avenue 
P.O. Box 971 
Florham Park, NJ 07932 
srini@att.research.com 



The University of Pennsylvania Working Papers in Linguistics (PWPL) is an 
occasional series produced by the Penn Linguistics Club, the graduate student 
organization of the Linguistics Department of the University of Pennsylvania. 

Publication in this volume does not preclude submission of papers elsewhere; 
copyright is retained by the authors of the individual papers. 

Volumes of the Working Papers are available for $15, prepaid. Please see our web 
site for additional information. 



The PWPL Series Editors 

Jim Alexander 
Alexis Dimitriadis 
Na-Rae Han 
Elsi Kaiser 
Michelle Minnick Fox 
Christine Moisset 
Alexander Williams 



How to reach the PWPL 

U. Penn Working Papers in Linguistics 
Department of Linguistics 
619 Williams Hall 
University of Pennsylvania 
Philadelphia, PA 19104-6305 

http://www.ling.upenn.edu/papers/pwpl.html 
working-papers@ling.upenn.edu 










Event Heads and the Distribution of Psych-Roots 

Martha McGinnis 



1 Introduction 

Most syntactic accounts of psychological predicates rely on the notion that 
the arguments within a verb phrase are “equidistant” for purposes of syntactic 
movement. Such a view was straightforward under the original “flat struc- 
ture” approach to VP, in which, for example, the direct and indirect objects 
are both treated as sisters of V. Following extensive work on object asym- 
metries (Baker 1988, Barss & Lasnik 1986, Bresnan & Moshi 1993, Larson 
1988, Marantz 1984, 1993, among others), it is now generally agreed that the 
verb phrase has an internal hierarchical structure. Nevertheless, unlike raising 
from one subject position to another, movement of internal arguments to 
subject position has often been treated as though it cannot be held to a strict 
locality (“shortest move”) condition. Accounts involving nonlocal movement 
of internal arguments have been especially prevalent in the literature on 
causative psych-verbs (PsyCaus verbs).¹ My ulterior motive here is to estab- 
lish that the syntax of psych-predicates actually supports locality in A- 
movement. The approach sketched below points the way towards overcoming 
a potential stumbling block for theories of A-movement, making it possible 
to maintain the strong hypothesis that all syntactic movement respects local- 
ity. 

2 The T/SM Restriction Without Movement 

As a point of departure I take the analysis proposed by Arad (1998, 1999). 
Arad dispenses with the traditional view that the subject of a PsyCaus predi- 
cate originates structurally below the object (Belletti & Rizzi 1988, Pesetsky 



*Thanks go to Maya Arad, Heidi Harley, Alec Marantz, Liina Pylkkänen, two 
anonymous reviewers, and the rest of the Lexical Categories reading group at Penn. 
This work was supported by a postdoctoral fellowship from the Social Sciences and 
Humanities Research Council of Canada (756-98-0515). 

¹PsyCaus predicates correspond to the preoccupare class of Belletti & Rizzi 
(1988). This term distinguishes them from the non-causative piacere class. The dis- 
tinction is important here, so I avoid Pesetsky’s (1995) term ObjExp, which groups 
the two together. 
1995). She proposes instead that the subject of a psych-construction is always 
generated as the highest argument, as in a normal transitive clause, and ar- 
gues that differences between psych-predicates and transitives arise largely 
from differences in the aspectual functional heads associated with this high- 
est argument. This proposal has the advantage that it avoids postulating non- 
local movement of a lower argument past a higher one to the subject position. 
As we will see, however, one generalization that remains to be captured un- 
der such an approach is Pesetsky’s T/SM (Target/Subject Matter) restriction. 
The T/SM restriction is the generalization that a PsyCaus verb cannot have 
both a Causer argument and a Target (1c) or both a Causer and a Subject 
Matter argument (2c): 

(1) a. Mary was angry at the government. TARGET 

b. The article in The Times angered Mary. CAUSER 

c. * The article in The Times angered Mary at the government. 

(2) a. Bill was frightened of another tornado. SUBJECT MATTER 

b. The distant rumbling frightened Bill. CAUSER 

c. * The distant rumbling frightened Bill of another tornado. 

In this paper I contend that the T/SM restriction falls under a broader 
generalization about causativization. Specifically, I propose that this restric- 
tion arises from a morphological distinction between causatives that deter- 
mine the syntactic category of a predicate, and causatives that are added to a 
predicate that already has a category. Categorial morphology is here equated 
with the “event head” of recent literature on lexical semantics (e.g., Harley 
1995, Kratzer 1996). Marantz (1997) proposes that a verbal event head 
merges syntactically with a category-neutral lexical root to produce a phrasal 
unit; this unit corresponds to what is usually thought of as a “lexical verb.” 
The event head is a functional head that often introduces an external argu- 
ment, as with causative v in (3a). I also adopt Baker’s (1997) view that an 
adjectival predicate can have an external argument, and suggest that this ex- 
ternal argument is the specifier of an adjectival event head a, as in (3b). We 
can call the event heads in (3) root-external, since they are directly outside 
the roots; by contrast, a category-external event head occurs outside another 
event head. In English and Japanese, root-external causatives can be spelled 
out using morphology that is idiosyncratically specified by the root, while 
category-external causatives use unspecified (default) morphology, which is 
affixal in Japanese, but not in English. Following Miyagawa (1998), I assume 
that the default causative morphology in English is the independent 
phonological word make. 
(3) a. The article angered Mary. b. Mary was angry at the government. 

[Tree diagrams omitted: a vP structure for (3a) and an aP structure for (3b), the latter containing at the govt.] 



I will argue that a predicate containing an Experiencer and a T/SM ar- 
gument must contain an event head. A causative added to such a predicate 
will be category-external, allowing only default causative morphology to be 
used (in English and Japanese). Along with the ill-formed (a) examples in (4) 
and (5), then, we have the well-formed (b) examples. 

(4) a. * The article in The Times angered Mary at the government. 

b. The article in The Times made Mary angry at the government. 

(5) a. * The distant rumbling frightened Bill of another tornado. 
b. The distant rumbling made Bill fear another tornado. 

3 The Different Flavors of v 

There are a number of syntactic differences between normal transitives (6a) 
and PsyCaus predicates (6b), to be discussed in Section 3.1. In accounting for 
these differences, Arad (1998, 1999) argues that the crucial distinction is in 
the way the subject is structurally introduced. Suppose that in both cases the 
subject is generated in the specifier of a light causative verb (v). However, in 
(6a) this verb is eventive, while in (6b), it is stative. 

(6) a. Maria mangia la mela. 

‘Maria is eating the apple.’ 
b. Questo preoccupa Gianni. 

‘This worries Gianni.’ 



²The different flavours of v will be labelled as follows: (eventive, agentive v: 
transitives and unergatives), (stative causative v: PsyCaus verbs), (unaccusa- 
tive v: unaccusatives), and (stative perceptive v: SubjExp verbs). See below for 
more detail. I will also assume an adjectival counterpart to 
It has been argued in the recent literature that agentive transitive verbs 
are (at least) bipartite, containing a light causative verb and a lexical base. In 
some cases, for example, an adverb like again can modify either the causing 
eventuality or the resulting eventuality (von Stechow 1996). Sometimes the 
causative head is realized by a distinct morpheme (Miyagawa 1994). In Eng- 
lish, main verbs are arguably raised to the position of the causative verb, 
giving the order I gave John t a book, instead of *I John gave a book.³ Ma- 
rantz (1997) gives a further argument that the causative v is a separate syn- 
tactic head, based on a contrast between verbal and nominal uses of the same 
lexical root. Let us go through this argument in some detail, since it intro- 
duces some ideas that will be important later on. 

The facts under consideration are below. Chomsky (1970) argues that 
verbs, such as destroy or grow, share a basic (root) component with their 
“derived” nominalizations, destruction or growth. Now consider the argu- 
ment-taking properties of the roots √destr- and √grow in their verbal and 
nominal contexts. The verb destroy must be transitive (7a-b), while grow can 
be transitive or unaccusative (7c-d). The usual account of the alternation in 
(7c-d) is that grow is basically unaccusative, but can have a causative ele- 
ment added to it, which introduces an agentive argument. 

(7) a. The army destroyed the city. 

b. * The city destroyed.⁴ 

c. John grew tomatoes. 

d. Tomatoes grew. 

The noun destruction can take a causative possessor, as shown in (8a), 
but growth cannot (8c). Marantz proposes that a derived nominalization 
places a category-neutral lexical root such as √destr- or √grow in a nominal 
syntactic context (e.g., sister of D). He locates the crucial distinction between 
the roots √destr- and √grow in their intrinsic semantics; √grow denotes an 
internally-caused change of state, while √destr- is not internally caused. Ma- 
rantz proposes that this difference in interpretation is responsible for differ- 
ences in their syntactic distribution. 



³This argument is based on a similar argument for raising to v in Collins (1997). 
Pesetsky (1989), Johnson (1991), and Koizumi (1993) provide more extensive evi- 
dence for verb raising in English. 

⁴The string (7b) is possible under a ‘middle’ interpretation, which I assume in- 
volves a causative v head, like (7a). See Embick (1997) for discussion. 
(8) a. the army’s destruction of the city 

b. the city’s destruction 

c. * John’s growth of tomatoes 

d. tomatoes’ growth 

In addition to the differences that arise in the nominalization context, √grow 
can either take an agentive subject in the verbal context (9a), or not (9b). 
Marantz argues that the agent is introduced by the causative verb v. √grow 
cannot take a causative possessor in the derived-nominal context (9c), since 
in this context there is no v to introduce one. Of course, the derived nominal 
without a possessor is fine (9d). 

(9) a. John grows tomatoes. b. Tomatoes grow. 

c. * John’s growth of tomatoes d. tomatoes’ growth 

[Tree diagrams omitted: each structure contains the root and its object, √grow tomatoes.] 



By contrast, √destr- can take a causative possessor in the nominal con- 
text. Marantz suggests that this option is available because the causative in- 
terpretation is recoverable from the semantics of the externally-caused root 
(10c). The robustly causative connotations of √destr- are also responsible for 
the fact that it must occur with agentive v in the verbal context (10a-b).⁵ 



⁵As Noyer & Harley (1997) observe, other verbs that allow the causative posses- 
sor are not as strongly causative, and thus need not occur with agentive v in the verbal 
However, the causative possessor can be absent from the nominal context, 
since v is absent from this context (10d). 

(10) a. The army destroyed the city. b. * The city destroyed. 

c. the army’s destruction of the city d. the city’s destruction 

[Tree diagrams omitted: (10a-b) are labeled vP, with √destr- the city below v; (10c-d) are DPs with the possessors the army’s and the city’s.] 




Consider the implications of the verbal and nominal facts taken together. 
The nominal counterpart of a causative verb like destroy can have a causative 
possessor, but the nominal counterpart of grow cannot, even though √grow 
can occur in a causative context. If the causative element could be added to 
√grow in the lexicon, the newly-minted causative should be able to appear in 
a nominal context, allowing an agentive possessor just like the nominalized 
causative destruction. However, if the causative is a v head added in the syn- 
tax, then the full range of facts can be explained, as above. 

In summary, there is considerable evidence that agentive transitives 
contain a light causative verb and a lexical base, which I will assume here is 
a category-neutral root. Pylkkänen (1998) provides a wealth of evidence from 
Finnish that PsyCaus verbs also have a two-part structure. For example, the 
adverb melkein ‘almost’ can modify either the causing eventuality or the re- 
sulting eventuality. Thus, (11a) can either mean that Matti did something or 
had some property that almost caused a state of disgust in Maija (i.e., the 



context. For example, The army’s explosion of the bridge is possible, but also The 
bridge exploded. 
mental state almost held), or that Matti almost did something or had some 
property that would have caused a state of disgust in Maija (i.e., the causing 
event almost occurred). Moreover, a PsyCaus verb in Finnish has causative 
morphology; compare the causative in (11a) with the noncausative subject- 
experiencer verb in (11b), where reportedly melkein introduces no ambiguity. 
The causative morphology in (11a) is also used with derived agentive verbs 
(11c). 

(11) a. Matti melkein inho-tti Maija-a. 

M.-NOM almost find.disgusting-CAUS.PAST M.-PAR 
‘Matti almost disgusted Maija.’ 

b. Maija melkein inhoa-a Matti-a. 

M.-NOM almost find.disgusting-3SG M.-PAR 
‘Maija almost found Matti disgusting.’ 

c. Pekka hajo-tti lasi-n. 

P.-NOM break-CAUS.PAST glass-ACC 
‘Pekka broke the glass.’ 

The semantic and morphological facts of Finnish support a bipartite 
structure for PsyCaus predicates. To these facts we can also add the counter- 
part of Marantz’s argument from nominalizations: Chomsky (1970) points 
out that certain psych-predicates resemble predicates like grow, in that they 
can occur with a causative subject in the verbal context (12a), but cannot take 
a causative possessor in the nominal context (13a).⁶ 

(12) a. John angered the children. 
b. The children were angry. 

(13) a. * John’s anger of the children 
b. the children’s anger 

If we adopt Marantz’s approach for these facts as well, we may conclude that 
a causative interpretation cannot be recovered from the root √anger, but a 
causer can be added to this root syntactically, by means of a light verb. Thus 
we have evidence that a light causative verb is present in both agentive tran- 



⁶It is worth pointing out that the English causative in (12a) can be either stative 
or eventive. The reading of most interest for the purposes of this discussion is the 
stative one, where John may or may not have been doing anything to anger the chil- 
dren — for example, if just the sight of him was enough to make them angry. Statives 
in Finnish are discussed below. 
sitives and psych-predicates. The structure of (12b), shown in (14b), is not 
exactly parallel to that of the unaccusative √grow in (9b); we will return to 
this point later. 



(14) a. John angered the children. b. The children were angry. 

c. * John’s anger of the children d. the children’s anger 

[Tree diagrams omitted: (14a) is a vP with John in the specifier of causative v and √angr- the children below; (14b) is an aP with the children in its specifier and √angr- below a; (14c-d) are DPs with the possessors John’s and the children’s.] 



In order to account for various syntactic differences between agentive 
transitives and psych-predicates, Arad (1998, 1999) argues that they involve 
different types of causative verbs, as noted above. Pylkkänen (1998) provides 
evidence from Finnish that psych-predicates involve a stative causative verb, 
rather than the eventive causative used in agentive transitives. The reader is 
referred to Pylkkänen’s paper for details, but a brief review follows. The ob- 
ject of a PsyCaus verb in Finnish has partitive case, indicating atelicity.⁷ Psy- 
Caus verbs also demonstrate other stative characteristics; for example, they 
receive a habitual interpretation in the present tense, and resist the progres- 
sive. An agentive transitive verb can occur in the progressive, though its ob- 



⁷There is also a class of causative psych-verbs that allows an ACC object (i); this 
case-marking pattern corresponds to a non-stative interpretation. Arad gives extensive 
evidence from Italian that some psych-roots can combine with either the eventive or 
the stative causative v. 

(i) Uutiset viha-stu-tti-vat Mikko-a/Mikko-n. 

news.NOM anger-INCH-CAUS.PAST-3PL M.-PAR/ACC 
‘The news made Mikko become angry.’ 
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ject then takes on partitive case (15a). A prototypical stative verb in Finnish 
cannot occur in the progressive (15b), nor can a PsyCaus verb (15c). 

(15) a. Mikko on maalaa-ma-ssa talo-a. 

M.-NOM is paint-INF-INESS house-PAR
‘Mikko is painting a house.’ 

b. * Pekka on osaa-ma-ssa ranska-a. 

P.-NOM is know-INF-INESS French-PAR
‘Pekka is knowing French.’ 

c. * Kaisa on sääli-ttä-mä-ssä Matti-a.

K.-NOM is pity-CAUS-INF-INESS M.-PAR 
‘Kaisa is causing pity in Matti.’ 

These facts provide evidence that psych-roots combine with a stative 
light causative verb, which has different syntactic properties from the even- 
tive light causative verb used in agentive transitives. Arad (1999) argues that 
this difference in causative verb types is partially responsible for the classic 
“psych-effects” as well.⁸ As we will see, Arad’s generalization has certain
key empirical advantages over other accounts of psych-effects in the litera- 
ture. 



3.1 Psych-Effects 

Belletti & Rizzi (1988; B&R) identify a collection of differences between 
PsyCaus predicates, which have an Experiencer object, and predicates with 
an Experiencer subject (SubjExp predicates), which have the syntax of regu- 
lar transitives. One such difference is the familiar “backward binding” phe- 
nomenon (Akatsuka 1976, Giorgi 1984, Pesetsky 1987). Unexpectedly, the 
object of a PsyCaus verb, such as worry, can bind an anaphor embedded in 
the subject (16a, 16c). The same is not true for other transitives, as shown by 
the contrasting examples in (16b, 16d). Similar facts obtain in Italian, as 
B&R demonstrate. 

(16) a. These rumors about himself worry John more than anything else.
b. * These rumors about himself describe John better than anything else. 



⁸More accurately, she proposes that these effects are associated with a stative
causative verb assigning accusative case in Italian. There is also a class of
psychological predicates (B&R’s piacere class) with DAT and NOM arguments.






MARTHA MCGINNIS 



c. Each other’s supporters worried Freud and Jung. 

d. * Each other’s supporters telephoned Freud and Jung. 

Two other restrictions on PsyCaus verbs can be seen in (17) and (18). Tran- 
sitive verbs can occur in a construction with a reflexive clitic (17a), and can 
also take an arbitrary pro subject (18a). Clauses with a “derived” subject 
(passives and unaccusatives) are incompatible with both, as illustrated in 
(17b) and (18b). PsyCaus verbs (the preoccupare ‘worry’ class) pattern with 
passives and unaccusatives in this respect, as shown in the (c) examples. 

(17) a. Gianni si è fotografato.

‘Gianni photographed himself.’

b. * Gianni si è stato affidato.

‘Gianni was entrusted to himself.’ 

c. * Gianni si preoccupa. 

‘Gianni worries himself.’ 

(18) a. pro ti stanno chiamando.

‘Somebody is calling you.’ 

b. * pro sono arrivati a casa mia. 

‘Somebody arrived at my place.’ 

c. * Evidentemente, in questo paese per anni pro hanno preoccupato 

il governo. 

‘Evidently, in this country people worried the government for 
years.’ 

Psych-predicates have another distinctive property, which can be
described in several ways. One way of putting it is as follows (Pylkkänen
1998). A causativized unaccusative increases in “valency,” permitting an
additional argument (19), while a causativized psych-predicate does not
increase in valency (20). (20a) is a SubjExp predicate. In its causative
counterpart (20b), the Experiencer is the object, but the other argument, at John, can
no longer be expressed. It has been argued (B&R, Pylkkänen 1998) that this
contrast arises because the added argument in (19b) adds a new semantic
role, while in (20b) it has the same semantic role as one of the existing
arguments (here, at John). The impossibility of (20b) then follows from the
traditional assumption that a single semantic role cannot be expressed by two
arguments of the same verb.⁹ Pesetsky (1995) takes a different approach to this
restriction, to which we return below.



(19) a. Tomatoes grew.

b. John CAUS+grew tomatoes. 

(20) a. The children were angry at John. 

b. Mary CAUS+angered the children (*at John). 

B&R’s account of the psych-effects is as follows. By their view, normal 
transitives (including SubjExp verbs) have an underlying external argument, 
while PsyCaus verbs have an unaccusative structure with a derived subject. 
Under this view, the similarities between PsyCaus structures, passives, and 
unaccusatives follow from the presence of a derived subject, and the back- 
ward binding effects are attributed to the base position of the derived subject. 
B&R propose that the subject of a PsyCaus verb originates below the Experi- 
encer object (21a). Thus, they claim, the Experiencer can bind an anaphor 
embedded within the derived subject before it raises to the subject position. 



(21) a. John frightens them.        b. They fear John.

[Tree diagrams for (21a) and (21b)]



The base order of the arguments is determined by their theta-roles. B&R
take the position that the subject of a PsyCaus predicate is a Theme, while the
object is an Experiencer. These are the same thematic roles they associate
with SubjExp predicates, which pattern with transitives throughout. B&R
argue that a Theme is always generated below an Experiencer argument of
the same verb. When the Experiencer has inherent Case, the Theme raises to
the subject position, and a PsyCaus predicate results (21a). Otherwise, both
arguments have structural Case, and the Experiencer is an external argument,
yielding a SubjExp predicate (21b). As noted above, this approach provides a
semantic account of (20b); two arguments are said to bear the Theme role, so
the structure is ill-formed.

⁹Note that the PP “argument” of a SubjExp predicate can be omitted, like an
adjunct. I follow Pesetsky in assuming that optional PP arguments of SubjExp
predicates (like be angry) have essentially the same syntactic status as obligatory DP
objects of SubjExp predicates (like fear). Thanks to Heidi Harley for raising this point.

Nevertheless, a number of problems with this account of
psych-predicates have been pointed out in the literature. For one thing, a Case-based
explanation of the differences between SubjExp and PsyCaus predicates does
not explain the causative interpretation of the latter, or the causative
morphology seen in Finnish. For another, movement of the lower Theme past the
higher Experiencer to the subject position seems to violate relativized
minimality (Rizzi 1990) or “attract closest” (Chomsky 1995).¹⁰ There are also
several key ways in which PsyCaus predicates fail to pattern with passives
and unaccusatives. For instance, the Experiencer object of a PsyCaus
predicate in Italian has accusative morphological case, just as in a transitive.
Moreover, the aspectual auxiliary used with a PsyCaus verb is avere ‘have,’
as with a transitive verb, while the auxiliary used with unaccusatives is essere
‘be.’ Pesetsky (1995) proposes an account that undertakes to explain both the
differences and the similarities between transitives and PsyCaus predicates.
The next subsection briefly summarizes the part of this account that is
consistent with Arad’s ‘flavors of v’ approach, adopted here. The rest of
the section concerns the remainder of Pesetsky’s account, to which this paper
proposes an alternative.

3.2 Towards Locality-Compliance 

Pesetsky (1995) takes the first steps towards the view that the derivation of
PsyCaus predicates respects locality. He argues that PsyCaus predicates
actually do have an external argument, namely the Causer. This argument has a
different semantic role from the object of a SubjExp verb, which Pesetsky
calls the Target or Subject Matter. The differences in interpretation can be
seen in (22) and (23). In (22a), the article is the Target of Bill's anger; for
example, he might be angry because it panned his new book. However, (22a)
could not be used to describe a situation in which Bill found the article itself
irreproachable, but its contents caused him to be angry at the government.
(22b), on the other hand, could be used to describe such a situation: “The
article does cause Bill to be angry, and possibly angry at someone or
something, but he is not necessarily angry at the article itself” (Pesetsky 1995: 56).

¹⁰A lower argument can A-move past a higher one under certain circumstances
(McGinnis 1998a), but such movement has consequences for binding that seem not to
arise with PsyCaus verbs, as we will see below.

(22) a. Bill was very angry at the article in The Times.

b. The article in The Times angered/enraged Bill. 

Similarly, in (23a), the television set is the Subject Matter of John's worry — 
for example, he might be worried because it is in a precarious position. This 
sentence could not be used to describe a neurotic situation in which John ex- 
perienced an ill-defined anxiety about his life in general whenever he saw or 
thought about the television set. Such a reading is possible in (23b), where 
“the television set causes John to experience worry, but the Subject Matter of 
his thoughts while experiencing worry could have nothing to do with the 
television set” (Pesetsky 1995: 57). 

(23) a. John worried about the television set.
b. The television set worried John. 

If PsyCaus predicates have a Causer external argument, then their differ- 
ences from normal transitives cannot follow from the absence of such an 
argument. Indeed, Pesetsky shows that one psych-effect, found in PsyCaus 
passives, can be attributed to the stative/eventive distinction between Psy- 
Caus predicates and normal transitives. B&R note that PsyCaus verbs allow a 
passive use. Since verbal passivization would be incompatible with the unac- 
cusative analysis, they propose that this passive is adjectival. Unlike eventive 
verbal passives (24a), and like clearly adjectival passives (24b), passives of 
PsyCaus verbs cannot occur in the progressive (24c) (Grimshaw 1991). 
However, Pesetsky points out that stative passives in general disallow the 
progressive. This generalization includes passives of SubjExp verbs, which 
have an external argument (24d). 

(24) a. The city is being destroyed by the soldiers. 

b. * The book was being very abridged. 

c. * Mary was being depressed by the situation. 

d. * This performance is being liked by Bill. 

Pesetsky shows that backward binding also fails to support the
unaccusative analysis, since this effect arises even when the subject originates above
the object. As we saw above, unlike eventive transitives (25a), PsyCaus verbs
(25b) allow backward binding. However, the same effects obtain if a
causative verb like make is used with a SubjExp complement (25c-d). Here the
subject clearly originates in a higher position than the Experiencer argument,
yet backward binding is possible.¹¹

(25) a. * Each other’s supporters telephoned Freud and Jung.

b. Each other’s supporters worried Freud and Jung. 

c. Each other’s supporters made [Freud and Jung angry]. 

d. Each other’s supporters made [Freud and Jung seem [t to
be angry]].

The unavailability of the reflexive clitic derivation also fails to support 
the derived subject analysis. B&R propose that (26) is ill-formed because it 
involves movement of Gianni from below si to above si. 

(26) * Gianni si preoccupa t. 

‘Gianni worries himself.’ 

This derivation is said to be ungrammatical because of a chain formation 
algorithm that prevents an anaphor from occurring in an intervening position 
in the chain between an argument and its trace — see (27), where left-to-right 
order represents c-command (Rizzi 1986). 

(27) * [NP_i ... anaphor_i ... t_i]

As Pesetsky notes, this condition cannot apply as stated, since there is
considerable evidence that the well-formed derivation of a transitive clause with
si does involve the configuration in (27), with the surface subject raising
from the object position, as in a passive (Marantz 1984, Kayne 1986). In
(28), the reflexive clitic is actually the external argument, but it fails to
become the syntactic subject, at least in part because it lacks Case (McGinnis
1998a).

(28) a. Gianni si guarda t. 

‘Gianni watches himself.’ 
b. Gianni si teme t. 

‘Gianni fears himself.’

¹¹Pesetsky demonstrates that another psych-effect, the impossibility of an
arbitrary pro subject, arises from semantic restrictions that cross-cut the
unaccusative/transitive distinction.

In providing an account of the passive-like derivation of (28), Marantz
(1984) raises the question of why this account should be necessary: why is it 
impossible to generate si as an accusative object clitic, and Gianni as the ex- 
ternal argument? The derivation in (28b) is actually forced by a condition 
very like Rizzi’s chain formation algorithm (McGinnis 1998a, 1998b; cf. 
Snyder 1992). This condition is stated in (29). 

(29) Lethal Ambiguity: An anaphoric dependency cannot be established be- 
tween two specifiers of the same head. 



Under the account of Case-checking in Chomsky (1995), the object of a tran- 
sitive clause checks Case on v. If the object is a clitic, it checks Case overtly, 
in a specifier of vP (30a). The external argument is base-generated in a speci- 
fier of vP. As a result, a reflexive clitic object would always violate Lethal 
Ambiguity, since the anaphor and its binder would occupy specifiers of the 
same head at one point in the derivation. Thus the only available derivation is 
the one in which the reflexive clitic is a Caseless external argument, allowing 
the passive-like derivation (30b). 
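
The reasoning in this paragraph can be restated as a small check. The following Python sketch is only an illustration under assumptions that are not part of the analysis itself: a derivation is modeled as a list of stages, each stage as a list of heads with their specifiers, and the names `Head`, `violates_lethal_ambiguity`, `object_clitic`, and `passive_like` are hypothetical. It encodes (29) as the requirement that a binder and its anaphor never occupy specifiers of the same head at any stage.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Head:
    """A syntactic head together with the phrases in its specifiers (toy model)."""
    label: str
    specifiers: List[str] = field(default_factory=list)


def violates_lethal_ambiguity(stages: List[List[Head]], binder: str, anaphor: str) -> bool:
    """Return True if binder and anaphor are specifiers of one head at any stage."""
    for heads in stages:
        for head in heads:
            if binder in head.specifiers and anaphor in head.specifiers:
                return True
    return False


# Assumed derivation (30a): si is an object clitic checking Case in a specifier
# of vP, where the external argument Gianni is base-generated.
object_clitic = [
    [Head("v", ["Gianni", "si"])],
]

# Assumed derivation (30b): si is the Caseless external argument in spec-vP,
# and Gianni raises from object position to spec-TP without landing in spec-vP.
passive_like = [
    [Head("v", ["si"])],
    [Head("T", ["Gianni"]), Head("v", ["si"])],
]

print(violates_lethal_ambiguity(object_clitic, "Gianni", "si"))  # True
print(violates_lethal_ambiguity(passive_like, "Gianni", "si"))   # False
```

On this toy encoding, only the object-clitic derivation places the anaphor and its binder in specifiers of the same head, which is why the passive-like derivation is the one that survives.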



(30) [Tree diagrams for (30a) and (30b); surviving fragment: Gianni_i T']



Kayne suggests the descriptive generalization that the (Caseless) reflex- 
ive si is always an external argument. Given the view adopted here — that the 
Causer of a PsyCaus predicate is an external argument too — we must be 
more specific, and say that reflexive si can be generated only in the specifier 
of certain light verbs. One such verb is the eventive causative v, as in (28a). 
Another would be the stative, non-causative v used with SubjExp verbs, as in 
(28b). However, as we have seen, Caseless si cannot appear with the stative,
causative v (26), or in passives and unaccusatives (as shown in (17)). 

In summary, Pesetsky’s arguments largely undercut the motivation for a 
locality-violating account of PsyCaus predicates. He shows that many of the
psych-effects attributed to the unaccusative derivation properly belong to 
other generalizations. Because he treats the Causer subject of a PsyCaus 
predicate as semantically distinct from the T/SM object of a SubjExp predi- 
cate, it should in principle be possible to generate all Causers above Experi- 
encers, and all T/SM arguments below Experiencers. This is essentially the 
approach of Arad (1998, 1999). However, Pesetsky notes that such an ap- 
proach leaves an important generalization unexplained, namely the T/SM 
restriction. In what follows, I will review the T/SM restriction and Pesetsky’s 
account of it, in preparation for the alternative account to be proposed here. 

3.3 The T/SM Restriction 

Under Pesetsky’s account of PsyCaus and SubjExp predicates, the former 
involve a Causer and an Experiencer, while the latter involve an Experiencer, 
and possibly a Target or Subject Matter argument. Pesetsky’s claim that the 
Causer and the T/SM theta-roles are distinct raises the question of why the 
two cannot co-occur, as shown in (1) and (2), repeated below. This co- 
occurrence restriction is what Pesetsky calls the T/SM restriction. 

(1) a. Mary was angry at the government.

b. The article in The Times angered Mary.

c. * The article in The Times angered Mary at the government.

(2) a. Bill was frightened of another tornado.
b. The distant rumbling frightened Bill.

In accounting for the T/SM restriction, Pesetsky proposes that the Causer
of a PsyCaus predicate is actually a derived external argument. The Causer
originates below the Experiencer, like a T/SM argument, as the object of a
causative preposition CAUS. It then raises to a theta-position (also Causer)
above the Experiencer. CAUS is affixal, and must attach to the verb
syntactically (31).

¹²Arad (1997) notes that si is also possible with B&R’s piacere class, which is
usually treated as an ObjExp class because it has a dative Experiencer. However, Alec
Marantz (class notes, 1999) suggests that the piacere class may have a SubjExp
derivation, with a quirky dative Experiencer subject. If so, the possibility of si with these
verbs can be attributed to the presence of the stative noncausative SubjExp v, as in
(28b), except that here this v is also responsible for quirky dative case on the
Experiencer.

(31) John angered the children. 




This proposal yields one possible account of the T/SM restriction. Suppose 
the T/SM argument receives its Case and theta-role from a preposition that 
intervenes between the main verb and the CAUS preposition, as shown in 
(32). If this preposition is not affixal, and cannot raise to V, it will block
movement of CAUS to V. In accordance with locality, CAUS cannot skip 
over the preposition to V, so the derivation is ill-formed (32). 

(32) * John angered the children at Mary. 

Pesetsky argues that the possibility of backward binding with PsyCaus
predicates arises from movement of the Causer from a position
c-commanded by the Experiencer to a position c-commanding it. However, as
noted above, backward binding also occurs in periphrastic psychological
causatives, when there is no such movement. In these cases, Pesetsky suggests,
backward binding is licensed by semantic identity between the external
argument and the object of CAUS. Here, however, the CAUS-PP, including the
lower Causer, can be freely deleted, since they add nothing to the causative
interpretation of make (33). Deletion of the CAUS-PP makes it possible to
have a T/SM argument as well as a Causer argument in these cases (34).

(33) The article made Mary angry at Clinton. 



[Tree diagram: VP containing the CAUS-PP [CAUS the article_i]]

(34) a. The article in The Times made Mary angry at the government.
     b. The distant rumbling made Bill frightened of another tornado.

As Pesetsky notes, this movement (or movement-like) theory of back- 
ward binding effects seems more principled than the descriptive generaliza- 
tion in (35). 

(35) A Causer argument of a predicate n may behave as if c-commanded by 
an argumental DP governed by n. (Pesetsky 1995:49) 

However, there is reason to doubt that the movement account of (35) can be
maintained. Note that Pesetsky’s account of the backward binding effects
assumes that a PsyCaus predicate such as (31) contains no higher causative
verb. The CAUS-PP alone is said to be responsible for the causative
interpretation of such predicates, so it cannot be deleted in (32). However, as we
saw in Section 2, there is evidence that PsyCaus predicates do contain a light 
causative verb. Thus the contrast between periphrastic causatives and Psy- 
Caus predicates remains unexplained. Arguably, then, the movement account 
does not improve empirically on the descriptive generalization in (35). 

Another problem posed by the movement account is that the Causer 
violates the locality condition on syntactic movement. Although the proposed 
derivation of (31) involves an unusual kind of movement, namely movement 
from one theta-position to another, we would still expect it to respect locality. 
That is, we would expect only the higher argument, the Experiencer, to be 
able to move to the higher Causer theta-position. Such a derivation might fail 
for Case reasons: the Experiencer would be unable to move to the higher 
Causer position because it has already checked (accusative) Case. This deri- 
vation would then be parallel to the ill-formed “superraising” derivation (36), 
in which neither of the arguments in the embedded clause can raise to the 
subject position of the matrix clause. Movement of the higher argument, it, is
blocked because this argument has already checked Case. Movement of the
lower argument is blocked because of locality, since the expletive it is closer
to the matrix subject position.

(36) * [ seems [(that) it was told John [that time was up]]]. 

Alternatively, we might suppose that the Experiencer can successfully move 
to the higher Causer position, but that this derivation converges as gibberish, 
given that the same argument has two theta-roles, and a single theta-role 
(Causer) is shared by two arguments. 

Instead, the derivation in (31) has a lower argument skipping over the 
higher one to the subject position. Movement of a lower argument past a 
higher one is in principle compatible with locality, but only if the lower ar- 
gument “leapfrogs” over the higher one. Let us assume, for concreteness, that 
an argument XP can leapfrog over a higher argument YP only if it first 
moves to a position where XP and YP occupy specifiers of the same head 
(Ura 1996), as shown in (37). The two specifiers are then “equidistant” for 
the purposes of locality. As noted above, however, an anaphoric dependency 
cannot obtain between specifiers of the same head. 

In Japanese, for example, an object can undergo A-movement to a
position where it c-commands the subject. From this position, it can bind into the
subject (38a), but cannot bind the subject directly (38b). A similar situation
arises if a direct object scrambles past an indirect object to a position above
the subject. The scrambled direct object can bind an anaphor embedded in the
indirect object (39a). However, since it must leapfrog over the indirect object
in order to move to its final scrambled position, the direct object cannot bind
the indirect object directly (39b). The observations in (38)-(39) are from
Yatsushiro (1997 and p.c.).

(38) a. Hiroshi-o [karezisin-no hahaoya]-ga [t nagutta].
        H.-ACC self-GEN mother-NOM hit.PST
        ‘His_i mother hit Hiroshi_i.’
     b. * Hiroshi-o karezisin-ga [t nagutta].
        H.-ACC self-NOM hit.PST
        ‘Himself_i hit Hiroshi_i.’

(39) a. Osamu-o Kazuko-ga [t [karezisin-no hahaoya-ni] [t miseta]].
        O.-ACC K.-NOM self-GEN mother-DAT showed
        ‘Kazuko showed Osamu_i to his_i mother.’
     b. * Osamu-o Kazuko-ga (kagami-o tukatte) [t karezisin-ni [t miseta]].
        O.-ACC K.-NOM mirror-ACC using self-DAT showed
        ‘Kazuko showed Osamu_i to himself_i (using a mirror).’

However, the subject of a PsyCaus predicate can bind the object,
suggesting that the two never occupy specifiers of the same head (40). These
examples appear to be acceptable on both an eventive agentive reading and a
stative PsyCaus reading. Thus, if the Causer subject were to originate below
the Experiencer, it could only move to the subject position by skipping over
the intervening argument. Such a derivation would violate locality: the
Experiencer should block movement of the Causer to the external argument
position.

(40) a. John frightens himself. 

b. Taroo-ga zibunzisin-o odorok-asi-ta. 

T.-NOM self-ACC surprise-CAUS-PAST 
‘Taroo surprised himself. ’ 

In their discussion of reflexive clitics and PsyCaus verbs, B&R note that 
binding is much improved with non-clitic anaphors. They propose that such 
anaphors can receive a “focal” interpretation, and that focused anaphors are, 
in effect, immune to the effects of Lethal Ambiguity (29) (or, equivalently 
here, Rizzi’s chain formation algorithm). However, this account does not 
explain why the (b) examples of (38)-(39) are ill-formed.¹³

Although there may be some way to make the movement account of 
PsyCaus predicates consistent with the above observations, I take these ob- 
servations as reasonable grounds for seeking an alternative. The ‘flavors of 
little v’ approach adopted here captures many of the same facts as Pesetsky’s 
account, though so far it offers no explanation of the T/SM restriction. The 
remainder of this paper is devoted to an account of the T/SM restriction that 
does not appeal to movement of the Causer from a position below the Expe- 
riencer. 

4 Root-External and Category-External Causatives 

As mentioned in the previous section, PsyCaus verbs (41) and psychological
causatives with make (42) differ with regard to the T/SM restriction:

(41) a. * The article in The Times angered Mary at the government.
     b. * The distant rumbling frightened Bill of another tornado.

(42) a. The article in The Times made Mary angry at the government.
     b. The distant rumbling made Bill frightened of another tornado.



¹³This said, there are apparently some cases in which the Experiencer cannot be
bound by the Causer. For example, consider (i)-(ii) from Finnish (Liina Pylkkänen,
p.c.). At the moment I have no explanation for such cases.

(i) ?? Pekka inho-tta-a itseään.
       Pekka disgust-CAUS-3SG self.PART
       ‘Pekka disgusts himself.’

(ii) ?? Pekka sure-tta-a itseään.
        Pekka be.sad-CAUS-3SG self.PART
        ‘Pekka makes himself sad.’

I will argue here that this contrast arises from the distinction between
root-external and category-external causatives. I assume that a verb consists of (at
least) a category-neutral root plus a root-external (category-determining)
event head, v. The proposal here will be that the examples in (41) involve just
a root-external causative v, while those in (42) involve a root-external v plus
a category-external causative v. Root-external causatives have sometimes
been called “monoclausal,” and category-external causatives “biclausal”
(Harley 1995).¹⁴

These two types of causatives have different semantic and morphologi- 
cal properties (Miyagawa 1980, 1989, 1994, 1998, etc.; Marantz 1997). First 
of all, the interpretation of root-external causatives is usually described as 
involving a more “manipulative” notion of causation than that of category- 
external causatives. Moreover, idioms can include a single causative v, but 
cannot cross the v boundary (Marantz 1997). Thus there are idioms based on 
a root-external causative, but no category-external causative idioms, in which 
both causative v heads are necessary to form the idiom. (43a) is a root- 
external causative idiom, with only a single v head; the noncausative coun- 
terpart has no idiomatic interpretation (43b) (Miyagawa 1980). 

(43) a. Taroo-ga zisyoku-o niow-ase-ta. 

T.-NOM resignation-ACC smell-CAUS-PAST
‘Taro hinted that he might resign.’ 

(lit. ‘Taro caused resignation to smell.’) 
b. Zisyoku-ga nio-u. 

resignation-NOM smell-PRES 
‘Resignation smells; *Resignation is hinted.’ 

Looking at French and English, Ruwet (1991) points out that a causative can 
only be idiomatic if the lower verb is non-agentive. Thus, for example, make 
ends meet is a possible idiom, because meet does not have an agentive 
meaning. By hypothesis, this is a root-external causative, with only a single 
event head. A category-external causative, like make X eat cake, can appar- 
ently never have an idiomatic reading that is absent when the higher causa- 
tive is removed. 

In some cases, the two types of causatives can be distinguished
morphologically. In English and Japanese, for example, the morphology used for
root-external causative v is idiosyncratic, varying as a function of the choice
of lexical root, while such variation is not observed in category-external
causatives. For example, consider the Japanese verbs in (44) (taken from
Jacobsen 1992). These verbs illustrate a causative/inchoative alternation, in which
the event head v is associated with overt morphology (Harley 1995,
Nishiyama 1998). On the left are unaccusative verbs, whose event head is
noncausative, and does not introduce an external argument. On the right are
transitive verbs, whose causative event head generally does introduce an external
argument. The causative morphology here varies idiosyncratically with the
lexical root.

¹⁴Miyagawa (1998) suggests that biclausal causatives actually involve two Tense
heads as well as two v heads. I leave this issue for further investigation.



(44) a.  ag-ar-u      ‘rise’               ag-e-ru    ‘raise’
     b.  hazu-re-ru   ‘come off’           hazu-s-u   ‘take off’
     c.  kog-e-ru     ‘become scorched’    kog-as-u   ‘scorch’
     d.  nar-∅-u      ‘ring (intr.)’       nar-as-u   ‘ring (tr.)’
     e.  ak-∅-u       ‘open (intr.)’       ak-e-ru    ‘open (tr.)’
     f.  kir-e-ru     ‘be cut’             kir-∅-u    ‘cut’



By contrast, for a category-external causative, the regular suffix -(s)ase
is used.¹⁵ Following Miyagawa (1998), I assume that -(s)ase spells out a
causative event head (v). (45) illustrates cases in which two causative v heads
attach to the category-neutral root. In (45a), the root-external causative is
realized as -(s)as; in (45b), it is pronounced -(s)ase; in (45c), it is phonologically
empty. In each case, the category-external causative is morphologically
realized as -(s)ase; idiosyncratic causative morphology cannot be inserted
outside causative v.

(45) a. Taroo-ga Hanako-ni kodomo-tati-o ugok-as-ase-ta. 

T.-NOM H.-DAT kids-ACC move-CAUS-CAUS-PAST
‘Taro made Hanako cause the kids to move.’

b. Hanako-ga Taroo-ni Ziroo-o Mitiko-ni aw-ase-sase-ru.

H.-NOM T.-DAT Z.-ACC M.-DAT meet-CAUS-CAUS-PRES 
‘Hanako will cause Taroo to make Jiro meet Michiko.’ 

c. Hanako-ga Taroo-ni piza-o tabe-0-sase-ta. 

H.-NOM T.-DAT pizza-ACC eat-CAUS-CAUS-PAST 
‘Hanako made Taro eat pizza.’ 
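
The division of labor between the two causative heads in (44) and (45) can be pictured as a lookup plus a regular rule. The Python sketch below is a minimal illustration, not an implementation of any proposal in the literature: the names `ROOT_EXTERNAL_CAUS`, `category_external_caus`, and `spell_out` are hypothetical, the table lists only the exponents given in (44) and (45), and the choice between -sase (after a vowel) and -ase (after a consonant) is an assumption about how the parenthesized (s) is resolved.

```python
# Idiosyncratic root-external causative exponents, listed per root.
# Entries are taken from (44) and (45); "" stands for a phonologically empty exponent.
ROOT_EXTERNAL_CAUS = {
    "ag": "e",
    "hazu": "s",
    "kog": "as",
    "nar": "as",
    "ak": "e",
    "kir": "",
    "ugok": "as",
    "aw": "ase",
    "tabe": "",
}

VOWELS = set("aeiou")


def category_external_caus(stem: str) -> str:
    """Regular -(s)ase: assumed to surface as -sase after a vowel, -ase after a consonant."""
    return "sase" if stem[-1] in VOWELS else "ase"


def spell_out(root: str, category_external: bool = False) -> str:
    """Return a hyphen-segmented causative stem for a root (tense suffix omitted)."""
    inner = ROOT_EXTERNAL_CAUS[root]          # chosen by the root: idiosyncratic
    segments = [root, inner if inner else "∅"]
    if category_external:
        pronounced = root + inner             # the empty exponent adds no segments
        segments.append(category_external_caus(pronounced))
    return "-".join(segments)


print(spell_out("kog"))         # kog-as
print(spell_out("ugok", True))  # ugok-as-ase
print(spell_out("aw", True))    # aw-ase-sase
print(spell_out("tabe", True))  # tabe-∅-sase
```

The inner exponent has to be listed root by root, whereas the outer one is computed without consulting the root at all, which is the asymmetry the text describes.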



¹⁵Another causative, -(s)as, can also be used in such contexts. This causative has
a slightly different interpretation from -(s)ase.

In the next section, I argue that not just causative v, but any category
head, will prevent the insertion of idiosyncratic causative morphology in 
Japanese. Apparently, certain morphological items (or classes of items) are 
restricted to the local domain of the lexical root. Our account of the T/SM 
restriction will depend in part on this observation. 

5 The Internal Structure of Psych-Predicates 

Before tackling the T/SM restriction, let us begin with a clear notion of the
syntax of a PsyCaus verb. Suppose the structure is as in (46a), with the root
taking an argument (the Experiencer) and merging with the stative causative
v. We can compare this with the structure for a category-external causative
added to a psychological predicate, as shown in (46b). Here the root merges
with noncausative stative v, yielding a SubjExp verb whose T/SM argument
checks structural Case (here, covertly) on v. In English and Italian, this Case
is realized by accusative case morphology, in Finnish by partitive case
morphology. The SubjExp structure then merges with a causative stative v
realized as make. Finally, consider a category-external psychological causative,
in which the SubjExp component is an adjectival predicate rather than a
verbal one (47). Here I will assume that the root combines with an adjectival
stative event head a, again yielding a SubjExp predicate. The adjectival event
head does not check structural Case, so if the predicate has a T/SM argument,
this argument must be Case-marked by a preposition (here, of).

(46) a. The rumblings frightened Bill.

[vP the rumblings [v’ v_caus [√P √fright Bill]]]

b. The rumblings made Bill fear another thunderstorm.

[vP the rumblings [v’ v_caus (make) [vP Bill [v’ v_perc [√P √fear another thunderstorm]]]]]

(47) The rumblings made Bill afraid of another thunderstorm.

[vP … √afraid of another thunderstorm]

Suppose that the T/SM argument can occur only in the presence of a
stative, noncausative event head. Derived nominalizations provide evidence
for this claim. In the English derived nominalization of a psych-root, the
T/SM argument can only appear as a postnominal PP, not as a prenominal
possessor (cf. Pesetsky 1995). For example, a Subject Matter PP is fine in
(48a), but as a possessor it is out (48b).¹⁶ (48c) has a reading where Bela
Lugosi is the Experiencer of fear, but not one where he is just the Subject
Matter of fear experienced by someone else. Similarly, in (49a), a Target PP
is fine, but the possessive DP cannot be interpreted as a Target. In (49b), a
reading is possible in which anger characterizes the contents of the article,
meaning something like “the article’s ferocity”, but not where the article is
simply the Target of anger. In (49c), Bill can be the Experiencer of anger, but
not just the Target of anger experienced by someone else.

¹⁶Thanks to Alec Marantz for suggesting this argument, as well as for pointing
out that the ill-formedness of examples like (48b) could also be attributed to the fact
that the T/SM argument is not an “affected” entity (see Anderson 1983).

(48) a. Bill’s fear of thunderstorms / of Bela Lugosi 

b. * thunderstorms’ fear 

c. Bela Lugosi’s fear 

(49) a. Hillary’s anger at the article / at Bill

b. ? the article’s anger 

c. Bill’s anger 

Marantz (1997) argues that the semantic role of the possessor of a de- 
rived nominalization must be semantically recoverable from the lexical root. 
As we saw, an argument of the root can be a possessor. The possessor in 
(50a) corresponds to the object of the transitive verb destroy, while the pos- 
sessor in (50b) corresponds to the subject of unaccusative grow, or the object 
of transitive grow. 

(50) a. the city’s destruction
b. tomatoes’ growth 

Suppose that arguments of the root are always semantically recoverable from 
the root (although other arguments may also be recoverable, such as the 
causative argument in the army's destruction of the city). If so, then the 
T/SM argument is not an argument of the psych-root. Rather, it can only be 
semantically licensed by certain functional heads, including noncausative 
stative heads forming nouns, adjectives and verbs. This view is in keeping 
with Pylkkänen’s (1998) proposal that an event head can have the semantics
of a light perception verb, which takes two arguments, the Experiencer and
the Percept (here, the T/SM argument).¹⁷ Let us suppose that this functional
“perception” predicate can be verbal or adjectival, permitting two arguments
in both verbal and adjectival SubjExp predicates. I assume that a T/SM argument
in a derived nominalization is also licensed by a functional head, the
nominalizing head (n).

¹⁷Pylkkänen actually argues that it is the causative event head used with PsyCaus
verbs that has the semantics of a light perception verb, not the event head used with
SubjExp verbs.

Note that in the usual case, we have assumed that a head assigns a theta- 
role to its sister or its specifier. In the structures given above, however, the 
event head assigns its T/SM role downwards, to the sister of the root. I adopt 
this structure because the T/SM argument can apparently check structural 
Case on a verbal event head (e.g., in (46b)), just as in a regular transitive. 
Assuming structural Case-checking always involves a relation between an 
NP and a higher functional head, the event head is above the T/SM argument. 
Moreover, the verb-T/SM word order in English SubjExp predicates suggests
that the T/SM argument is below the event head, since although a root may
raise overtly to v in English, it generally does not raise to a higher functional
head (such as T).¹⁸

Let us review the key claims. The T/SM argument is not an argument of 
the root. It must be licensed by particular event heads, which generally have 
the semantics of a light perception verb. Suppose, then, that a causative event 
head does not itself have the relevant semantics to license a T/SM argument. 
If so, then the only way to combine the causative meaning with a T/SM ar- 
gument is to generate a category-external causative, with a lower perception 
event head in addition to the higher causative event head (see Section 5.1). 
However, the idiosyncratic causative morphology specified by the root is not
used to spell out a category-external causative v. In English, the root can 
specify affixal (or null) morphology only for a root-external causative v; 
category-external causatives must be periphrastic, using the default causative 
morphology make. 
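
The key claims just reviewed can be restated as a small decision procedure. The Python sketch below is only an illustration under simplifying assumptions: encoding a predicate as a list of event heads, the labels "perc" and "caus", and the function names `licenses_tsm` and `causative_exponent` are all hypothetical conveniences, not part of the analysis itself.

```python
# Toy encoding (an assumption for illustration): a predicate is the list of
# event heads merged above the root, lowest first.
#   "perc" = noncausative stative (perception) event head
#   "caus" = causative event head


def licenses_tsm(heads):
    """A T/SM argument requires a perception event head in the structure."""
    return "perc" in heads


def causative_exponent(heads, language="english"):
    """Describe how the causative head is spelled out (None if there is none)."""
    if "caus" not in heads:
        return None
    # Root-external: no category-determining head sits between root and CAUS.
    root_external = heads.index("caus") == 0
    if root_external:
        return "root-specified affix (possibly null)"
    if language == "english":
        return "periphrastic make"
    if language == "japanese":
        return "default affix -(s)ase"
    raise ValueError(f"no spell-out rule sketched for {language!r}")


psycaus = ["caus"]            # e.g. frighten: root + causative v
make_fear = ["perc", "caus"]  # e.g. make ... fear: root + perception v + causative v

for name, heads in [("PsyCaus", psycaus), ("make + SubjExp", make_fear)]:
    print(name, licenses_tsm(heads), causative_exponent(heads))
```

On this encoding, a structure that licenses a T/SM argument and also contains a causative head is necessarily category-external, so in English its causative can only surface as make.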

5.1 Evidence for a SubjExp Event Head 

Thus far we have mainly been concerned with PsyCaus predicates. What is 
the evidence that SubjExp predicates contain an event head? SubjExp predi- 
cates are more like eventive transitive predicates than like PsyCaus predi- 
cates, in that they fail to show the classic psych-effects. The similarity is 
somewhat puzzling, since eventive transitive and SubjExp predicates differ 
with respect to both causativity and eventivity. However, the two are not 
syntactically identical. Although in some languages SubjExp verbs have a
nominative subject and passivize, just like eventive transitives, in others (e.g.,
Georgian, Icelandic), SubjExp verbs have a “quirky” dative subject, and resist
passivization. Arad (1999) proposes that the experiencer of a SubjExp
verb is introduced by a third type of v, a stative noncausative v (see Marantz
1989 for a similar suggestion). I will adopt this proposal here, leaving open
the question of why SubjExp predicates and transitives often pattern together,
and against PsyCaus predicates.

¹⁸Another alternative would be to say that the T/SM argument is licensed, not by
the category-determining event head, but by a separate functional head sandwiched
between this head and the root. For example, Alexiadou (1999) suggests that an aspectual
functional head (Asp) occurs in this position.

Is it accurate to call the head that introduces the Experiencer of a Subj- 
Exp verb an event head? It was reported above (see example (11b)) that Fin- 
nish melkein ‘almost’ has only one scope with SubjExp verbs. This suggests 
that a SubjExp clause contains only one eventuality, namely the one denoted 
by the lexical root. On closer examination, however, adverb scope options 
appear to admit the possibility of a bi-eventive structure for SubjExp predi- 
cates. Consider the English examples in (51). (51a) could describe a situation 
in which Mimi was undecided about Bob, and was on the point of liking him, 
but then he did something ghastly that destroyed her opinion of him forever. 
Alternatively, it could describe a situation in which Mimi was quite decided 
about Bob, and what she experienced towards him was a feeling approaching 
affection. A similar ambiguity seems to arise in (51b), where the SubjExp 
predicate is adjectival. 

(51) a. Mimi almost liked Bob. 

b. Mimi was almost angry with Bob. 

This ambiguity supports the presence of an event head in SubjExp predicates. 
Let us suppose that the first reading involves modification of the “experi- 
ence” eventuality denoted by the stative noncausative event v, while the sec- 
ond involves modification of the “state of mind” eventuality denoted by the 
root.¹⁹

However, even if SubjExp predicates include two syntactic heads, this 
does not necessarily mean that they contain an event head. Marantz (1989, 
1993) argues that the higher indirect object of a double-object predicate is 
generated in the specifier of a light applicative verb. This verb is realized by 
overt morphology in various Bantu languages, among others. Nevertheless, 
assuming an applicative verb is present in English, it does not require the 
default causative morphology make. (52) shows double-object predicates 
with either a causative affix -en or no overt causative. Little or no overt applicative
morphology occurs in English; the applicative head is shown as
italicized ∅ below.

¹⁹Of course, this approach predicts that, on closer examination, both adverb
scopes will turn out to be available in Finnish as well.

(52) a. He [vP thick-∅-en-ed [APPLP me t [√P t some soup]]].

b. John [vP bake-∅-∅-ed [APPLP Bill t [√P t a cake]]].

c. Mary [vP kick-∅-∅-ed [APPLP Sue t [√P t the ball]]].

Thus, by our assumptions, there are light verbs (such as APPL) that are not 
event heads. However, there is evidence that, by contrast with APPL, the light 
verb introducing the Experiencer of a SubjExp predicate is an event head, 
introducing an “external” argument. 

Georgian provides some evidence for a difference between APPL and the
SubjExp event head v_perc. In Georgian, both the indirect object introduced by
APPL and the Experiencer subject introduced by v_perc have dative morphological
case (Harris 1981). Moreover, many SubjExp verbs have an affix that is
morphologically identical to APPL (the “relative prefix” that adds an indirect 
object to a transitive or unaccusative clause). Nevertheless, the Experiencer 
subject behaves differently from the indirect object in several ways. For ex- 
ample, some speakers require the reflexive anaphor tavis tav to be bound by a 
subject. These speakers do not permit the indirect object to bind the anaphor 
(53a), but do permit the Experiencer to do so (53b). Moreover, although the 
dative Experiencer behaves like the syntactic subject, in a passive the indirect 
object does not become a dative subject.²⁰ Instead it appears with the postposition
-tvis, while the direct object becomes the subject (53c).

(53) a. nino paTara gela-s tavis tav-s ∅-acveneb-s sarKeSi.

N.-NOM little G.-DAT self-DAT APPL-show-PRES mirror-in
‘Nino_i showed little Gela_j herself_i/*himself_j in the mirror.’

b. temur-s tavis tav-i u-qvar-s. 

T.-DAT self-NOM v-love-PRES 
‘Temur loves himself.’ 

c. vaSl-i micemulia masCavleblis-tvis. 
apple-NOM give.PASS.PRES teacher-for 

‘An apple is given to the teacher.’ (Harris 1981:103) 



²⁰In this it differs from a dative indirect object in Icelandic, which becomes the
subject in a passive, just like a dative Experiencer (Zaenen, Maling & Thráinsson
1985).

It is fairly straightforward to argue that SubjExp verbs have an external
argument. Such verbs typically show normal transitive behavior, aside from 
the possibility of quirky dative case on the subject. According to B&R, Ital- 
ian SubjExp verbs pattern with transitives, as opposed to verbs with no exter- 
nal argument. For example, as noted above, SubjExp verbs can passivize in 
many languages (54). Passivization is generally considered possible only for 
verbs with an external argument. 

(54) a. Mary was loved by all. ENGLISH 

b. Maija-a inho-taan. FINNISH (Pylkkänen 1998)

M.-PAR find.disgusting-PASS

‘Maija is found disgusting.’

c. Gianni è/viene temuto da tutti. ITALIAN (B&R)

G. is/comes feared by everyone 

‘Gianni is feared by everyone.’ 

It is more difficult to demonstrate that adjectival SubjExp predicates 
have an event head and a corresponding external argument. However, evi- 
dence for this view can be found from a contrast noted by Burzio (1986) and 
Cinque (1990). These authors point out that adjectival predicates typically 
pattern with unergative verbs, although semantically similar stative verbs are 
unaccusative. For example, the partitive clitic ne ‘of them’ cannot be ex- 
tracted from the subject of the adjectival predicate in (55a); ne-cliticization is 
likewise blocked with unergative verbs. By contrast, the stative verbal predicate
in (55b) is unaccusative, and allows ne-cliticization.²¹

(55) a. * Ne sarebbero sconosciute^ molte (di vittime). 

of-them would be unknown many (of victims) 
b. Ne sarebbero riconosciute^ molte (di vittime). 
of-them would be recognized many (of victims) 

Not all adjectival predicates have external arguments; for example, the subject
of English likely can be raised from a lower clause (as in Mary is likely
to win). However, SubjExp adjectives in Italian also block ne-cliticization
(56). Thus, adjectival SubjExp predicates appear to have an external argument.
We can suppose that this argument is introduced by a category-determining
event head, a, just as the external argument of a verb is introduced
by a category-determining event head, v.

²¹These examples are quoted from the literature; some of the Italian speakers I
have checked them with find them quite marginal.

(56) a. * Ne sarebbero arrabbiate^ molte (di vittime). 
of-them would be angry many (of victims) 
b. * Ne sarebbero impaurite^ molte (di vittime). 

of-them would be afraid many (of victims) (Michela Ippolito, p.c.) 

The reasoning here is as follows: given that SubjExp adjectives and
verbs are complex predicates, and given that the Experiencer argument is an
external argument, we can conclude that the functional head that introduces 
the Experiencer is an event head, just as in a regular transitive. If a causative 
is added to a predicate with this event head, it will of course be category- 
external. In English, such a causative must use the default morphology make; 
the null or affixal causative morphology of a PsyCaus verb cannot be used in 
forming a causative of a SubjExp predicate. This, I submit, is the right expla- 
nation of the T/SM restriction. 

5.2 Further Predictions 

If it is true that the T/SM restriction follows in part from the morphological 
properties of English causatives, we can derive a couple of predictions. First, 
we have suggested that null or affixal causative morphology in English is 
always root-external, and that adjectival predicates (often) have an external 
argument introduced by a category-forming event head, a. If so, then affixal 
causatives should usually not attach outside adjective-forming affixes. Sec- 
ondly, we noted that both root-external and category-external causatives are 
affixal in Japanese. We expect the T/SM restriction to hold for root-external 
affixal causatives in Japanese, but not for category-external affixal causa- 
tives. 

The first prediction holds up fairly well. The causative affixes -ify and 
-en are often said to attach to adjectives to form verbs, but these affixes do 
not attach to stems that already have an adjectival affix. For example, a 
search of Webster’s online dictionary reveals that -ify often attaches to bound 
stems (57a), sometimes to stems that can appear in unaffixed form as adjec- 
tives (or nouns) (57b), but never to “derived” adjectives. Causative -en does 






MARTHA MCGINNIS 



not appear to attach to bound stems, but it attaches only to stems lacking a
suffix (58).²²

(57) a. beaut-, fort-, dign-, myst-, Russ-, spec-, transmogr- ... 

b. dense, false, diverse, French, just, prett(y), pure, rare, simple, solemn,
solid, tack(y), ugl(y)

(58) awake, broad, coarse, deaf, fresh, glad, hard, loose, mad, neat, quiet, 
red, sad, thick, weak... 

However, there are causative suffixes in English that attach outside adjective-forming
suffixes, contrary to the most straightforward prediction. For
example, English -ize attaches to derived adjectival forms of various kinds
(59). Nevertheless, unlike periphrastic make, which can also be added outside
an adjectival predicate, -ize does not allow both a Causer and a T/SM
argument (60c).

(59) a. -ic: metr-ic, myth-ic, poet-ic...

b. -(u)al: centr-al, palat-al, trib-al, concept-ual, sex-ual, intellect-ual...

c. -ar: pol-ar, line-ar, singul-ar...²³

d. -(ia)n: America-n, India-n, Russ-ian, Grec-ian, Ital-ian...

e. -ive: collect-ive, subject-ive, relat-ive...

(60) a. The citizens were terrified of the dictator.

b. The soldiers terrorized the citizens.

c. * The soldiers terrorized the citizens of the dictator.

Although, like causative make, -ize can attach outside some category- 
determining morphology, it is subject to a special restriction. Note that, un- 
like make, -ize never attaches outside a causative head, such as the head that 
introduces the agent Heidi in (61). It can form a root-external causative (61a), 
but not a causative of a causative (61b). 

(61) a. The advice of the pet store made [Heidi gradually acclimate/acclimatize
her cats to the weather in Arizona].

b. * The advice of the pet store acclimatized [Heidi gradually (of) her
cats to the weather in Arizona].

²²I assume that humid and rigid are in fact underived, despite the existence of the
apparently related words humor and rigor. I also assume that the verbs bedizen, betoken,
cozen, and open are not analyzed by English speakers as bound roots suffixed
with -en.

²³-al and -ar may well be phonologically conditioned allomorphs of the same
morpheme (Morris Halle, p.c.).

Observing that -ize can only attach to Latinate roots or affixes, Pesetsky
(1995) proposes that -ize cannot attach to a causative because CAUS (here,
v_caus) in English is [-Latinate]. We can make the same proposal here for
a_perc. The adjective-forming affixes in (59) are [+Latinate], but if a_perc is
[-Latinate] in English, -ize will not attach to it; a category-external causative
v will instead be spelled out with non-affixal causative morphology, like make.

Note also that although -ize and make are both category-external, they 
may not spell out exactly the same syntactic/semantic features. Lieber (1998) 
argues that -ize is not generic causative morphology, but rather spells out a 
distinct core meaning, which she calls ACT. Although adding a causative to a
predicate containing a_perc produces a semantically and syntactically well-formed
structure, it does not follow that adding ACT does.

In general, then, the evidence seems to support our first prediction, 
namely that causative affixes in English will not attach outside of adjective- 
forming affixes. Because -ize attaches outside adjective-forming affixes, we
might expect it to be able to attach outside a_perc, like make. However, the fact
that -ize cannot attach outside a_perc can be attributed to morphological and
perhaps semantic restrictions on its distribution. Thus the account given successfully
predicts that make, and not affixal causatives, can be used to add a
causative meaning to a predicate with a Causer and a T/SM argument in 
English. 

We now turn to the second prediction, that Japanese root-external causa- 
tives will display the T/SM restriction, while category-external causatives 
will not. This prediction is also borne out. Miyagawa (1980) notes a semantic 
contrast between two causative counterparts of the SubjExp predicate 
odoroku ‘be surprised’. The causative formed with -(s)as, in (62a), has the 
interpretation of a PsyCaus verb, with the Causer directly producing surprise 
in the Experiencer. The causative formed with -(s)ase, as in (62b), has a 
category-external causative interpretation, with the Causer indirectly pro- 
ducing surprise in the Experiencer. For example, in (62a) the actress’s sur- 
prise is a genuine response to the director, while in (62b) it could simply be 
produced for effect, in response to a direction. 

(62) a. Eiga kantoku-ga zyoyuu-o odorok-asi-ta.

movie director-NOM actress-ACC surprise-CAUS-PAST
‘The movie director surprised the actress.’

b. Eiga kantoku-ga zyoyuu-o odorok-ase-ta.

movie director-NOM actress-ACC surprise-CAUS-PAST
‘The movie director made the actress be surprised.’

In the noncausative SubjExp counterpart, a T/SM argument with dative ni 
can be introduced (63a). However, this argument can only be used with the 
category-external -(s)ase causative (63b), not with the root-external -(s)as 
causative (63c) (Kazuaki Maeda, p.c.). As predicted, the T/SM restriction 
holds in a root-external causative, but not in a category-external causative. 

(63) a. Zyoyuu-ga sono koto-ni odoroi-ta. 

actress-NOM that fact-DAT surprise-PAST 
‘The actress was surprised at that fact.’ 

b. Eiga kantoku-ga zyoyuu-o sono koto-ni odorok-ase-ta.
movie director-NOM actress-ACC that fact-DAT surprise-CAUS-PAST
‘The movie director made the actress surprised at that fact.’

c. * Eiga kantoku-ga zyoyuu-o sono koto-ni odorok-asi-ta.

movie director-NOM actress-ACC that fact-DAT surprise-CAUS-PAST
‘The movie director surprised the actress at that fact.’

(63c) is apparently well-formed semantically, given that both types of causa- 
tive allow an additional “causer” argument to be introduced by the particle 
de: 

(64) a. Eiga kantoku-ga zyoyuu-o sono koto-de odorok-ase-ta.

movie director-NOM actress-ACC that fact-b/c surprise-CAUS-PAST
‘The movie director made the actress surprised because of that fact.’
b. Eiga kantoku-ga zyoyuu-o sono koto-de odorok-asi-ta.

movie director-NOM actress-ACC that fact-b/c surprise-CAUS-PAST
‘The movie director surprised the actress because of that fact.’

The behaviour of Japanese causatives supports our second prediction: the 
T/SM restriction holds only in a root-external causative, even if the category- 
external causative is also affixal. 

6 Conclusions 

I have argued here that the T/SM restriction arises from two causes. First, the
Target or Subject Matter argument is licensed, not by the root, but by the
noncausative stative event head occurring in SubjExp predicates, which determines
the category of the predicate, and conveys the semantics of perception
(v_perc or a_perc). Thus, a T/SM argument can arise only in the presence of
such a head. Secondly, adding a Causer to a predicate with a category-
determining head generally blocks the use of null or affixal causative mor- 
phology in English, so only a periphrastic causative can be used when both 
the Causer and T/SM arguments are present. PsyCaus verbs are root-external 
causatives, involving only one event head (the causative v), so English allows 
null or affixal causative morphology here. In Japanese, however, a category- 
external causative can also use affixal morphology. There the T/SM restric- 
tion arises only with root-external affixal causatives, and not with category- 
external affixal causatives. 

The approach sketched here makes it possible to preserve the view that 
A-movement respects locality; as such, it is worth pursuing further.

Echo Reduplication in Kannada:
Implications for a Theory of Word Formation* 



Jeffrey Lidz 



1 Introduction 

According to the Lexicalist Hypothesis, morphological structure is built in 
the lexicon by processes distinct from those that build syntactic structure. The 
structure of morphologically complex words is erased upon insertion into a 
syntactic phrase-marker and hence, is invisible to sentence-level operations 
and descriptions (Chomsky 1981, DiSciullo and Williams 1987, Kiparsky
1982, Mohanan 1981). Hand in hand with this morphosyntactic hypothesis
are the following morphosemantic and morphophonological claims. First,
some structure-meaning correspondences are created in the lexicon and hence
are idiosyncratic, as in (1a, b), while others are created in the syntax and
hence are transparently compositional, as in (1c).

(1) a. /kæt/ = CAT

b. /trans+mit+ion/ = PART OF A CAR

c. a cat sleeps = SLEEP(CAT) 

Second, some phonological rules apply in the lexicon, and hence can have 
idiosyncratic properties (e.g., English trisyllabic laxing: (2a) vs. (2b)), while 
others apply postsyntactically (or everywhere) and hence are exceptionless 
(e.g., English flapping: (3a) vs. (3b)). 

(2) a. ser[ij]n : ser[e]nity 
b. ob[ij]s : ob[ij]sity 

(3) a. sea[D]ed 

b. Have a sea[D]. I’ll be right back. 



*Subject to the usual disclaimers, I thank the following people for advice, discussion,
criticism and harassment during the preparation of this paper: R. Amritavalli,
Tonia Bleam, S. Chandrashekar, Heidi Harley, Bill Idsardi, Alec Marantz, Martha 
McGinnis, Rolf Noyer, Sharon Pepperkamp, Colin Phillips and Alexander Williams. 
A previous incarnation of these ideas was presented at the 1999 Linguistic Society of 
America Annual Meeting.

A corollary of the lexicalist hypothesis is that there should be converging
criteria which distinguish words from constituents of larger size. We expect
various measures of wordhood to lead us to the same object. The domain
of semantic idiosyncrasy should be the same as the domain of phonological
idiosyncrasy. Recent work in the framework of Distributed Morphology
challenges lexicalism by showing that there is no single object that is defined 
by these various criteria (Marantz 1997, Noyer 1998). The elements with 
idiosyncratic meaning are not the same as the elements defined phonologi- 
cally as words. Neither of these, in turn, correlates with the domain of non- 
productive morphological rules. Hence, these authors conclude that there is 
no well-defined category of word, and so a lexicalist grammatical architec- 
ture in which idiosyncratic semantic, syntactic and phonological properties 
are stored together in a single lexicon becomes less plausible. 

This paper adds to the arguments against lexicalism by focusing on the
syntactic properties of a morphological rule in Kannada traditionally referred
to by Dravidianists as Echo Reduplication (Emeneau 1938).¹ I will show that
Echo Reduplication (ER) in Kannada applies equally to words, subparts of
words and entire syntactic phrases.² Because ER can apply to phrasal categories,
we must conclude that it applies post-syntactically; it takes syntactic
structures as input and returns morphological forms. Given that it also applies 
to morphological units which form subparts of words, we conclude that these 
units are also visible post-syntactically. That is, the internal, sub-word, 
structure must be visible at the same point as the phrasal structure. Hence, a 
theory in which word-internal structure is erased prior to the construction of 
phrases becomes more difficult to maintain. The alternative to the lexicalist 
theory is one in which syntax provides the input to the morphological com- 
ponent, as in the Distributed Morphology framework. On this view all struc- 
ture composition takes place in the syntax, which in turn is read by the mor- 
phological module. 

¹This kind of rule is usually called “fixed melody reduplication” in the generative
phonological tradition. See, for example, McCarthy 1982, Marantz 1982, Yip 1992,
Jha, Sadanand and Vijayakrishnan 1997 for morphophonological analysis.

²Unless noted otherwise, all Kannada data were collected in 1998 and 1999 from
R. Amritavalli, S. Chandrashekar and S. Vedantam. Special thanks to R. Amritavalli
for her time and careful assistance in the construction of these data.

It is important to observe, however, that there are morphological structures
which do not allow ER to apply inside of them, suggesting that some
morphological structure is not phrase-structurally represented. Hence, we
have evidence that some amount of morphological structure can be seen as
syntactic structure and that some amount of morphological structure cannot.
If the morphological structure that is not phrase-structural were to correspond 
to some other criteria of lexical item, then we would be able to maintain the 
lexicalist hypothesis. It does not, however. This leaves us with the question 
of how to distinguish those pieces of morphological structure that allow ER 
to apply inside of them from those that do not in a theory without a tradi- 
tional lexicon, such as Distributed Morphology. I propose that the relevant 
distinction is between apparent ‘morphemes’ which are added to the root 
inside a postsyntactic morphological component and those which are added 
to the root by syntactic composition. 

The paper proceeds as follows. In section 2, I will introduce ER, de- 
scribing the environments in which it can apply and the problems that these 
data pose for various versions of the lexicalist hypothesis. In section 3, I present
some other possible analyses of ER that maintain the lexicalist hypothesis
and I show why these fail to account for the data adequately. In section 4,
I present an additional argument from affix ordering against a lexicalist 
analysis of ER. Finally, in section 5, I outline an analysis of the apparent ex- 
ceptions to the rule of ER. 



2 The Facts

ER in Kannada repeats an element, replacing the first CV with gi- or gi:-
(depending on the length of the input vowel), and yields a meaning of ‘and
related stuff’ (reduplicant glossed as RED):³

(4) a. pustaka
book
‘book’

b. pustaka-gistaka
book-RED
‘books and related stuff’

³Although this paper is not concerned with giving a phonological analysis of ER,
phonologically minded readers will want to know what happens when a word beginning
with gi- undergoes ER. Four informants gave four different answers to this
question. One speaker said that ER applies to such words just as it would to any other
word. Hence, we find: giDa ‘plant’ -> giDa-giDa. A second speaker said that the first
consonant of the reduplicant must change to either b or v: giDa-biDa, or giDa-viDa.
The third speaker agreed with both of the other two speakers in allowing either substitution
or not and also said that some speakers may simply be unable to reduplicate
such a word at all. The fourth speaker requires the fixed melody to be changed to pa:
giDa-paDa. See Jha et al. 1997 for a phonological analysis of ER in various Indian
languages. Also see Trivedi 1990 for a typology of ER in India.

ER can apply to all classes of words except interrogative pronouns and demonstrative
adjectives (Sridhar 1990). In (4) we see ER applying to a noun;
in (5), a verb; in (6), an adjective; and, in (7) a preposition:



(5) a. ooda 
run 
‘run’ 



b. ooda-giida beeDa
run-RED PROH 
‘Don’t run or do related activities.’ 



(6) a. doDDa 
large 
‘large’ 



b. doDDa-giDDa 

large-RED 
‘large and the like’ 



(7) a. meele 
above 
‘above’ 



b. meele-giile 
above-RED 
‘above and the like’ 
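
The segmental side of the rule, as it applies in (4) to (7), can be sketched in Python. This is a minimal sketch, assuming the romanization used in the examples (a doubled vowel letter marks a long vowel, and capital letters are consonants); the function names `reduplicant` and `echo` are hypothetical, and the speaker-specific treatment of gi-initial words is not modeled.

```python
import re

# Optional onset consonants, then a vowel, optionally doubled (= long vowel).
_FIRST_CV = re.compile(r"^([^aeiou]*)([aeiou])(\2?)")


def reduplicant(base):
    """Replace the first (C)V of `base` with gi (short vowel) or gii (long vowel)."""
    match = _FIRST_CV.match(base)
    if match is None:
        raise ValueError(f"no vowel found in {base!r}")
    is_long = bool(match.group(3))
    melody = "gii" if is_long else "gi"
    return melody + base[match.end():]


def echo(base):
    """Return the base followed by its echo reduplicant, joined by a hyphen."""
    return f"{base}-{reduplicant(base)}"


for word in ["pustaka", "ooda", "doDDa", "meele"]:
    print(echo(word))
# pustaka-gistaka
# ooda-giida
# doDDa-giDDa
# meele-giile
```

Because the function takes an arbitrary string as its base, the same sketch can be applied to a bare stem or to a fully inflected form.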



ER may apply either inside ((8a), (9a)) or outside ((8b), (9b)) of inflectional
elements:⁴

(8) a. baagil-annu much-gich-id-e anta heeLa-beeDa

door-ACC close-RED-PST-1S that say-PROH

‘Don’t say that I closed the door or did related activities.’

b. baagil-annu much-id-e-gichide anta heeLa-beeDa

door-ACC close-PST-1S-RED that say-PROH

‘Don’t say that I closed the door or did related activities.’



(9) a. baagil-giigil-annu much-id-e
door-RED-ACC close-PST-1S

‘I closed the door and related things.’

b. baagil-annu-giigilannu much-id-e
door-ACC-RED close-PST-1S

‘I closed the door and related things.’

Entire phrasal categories may be reduplicated by ER: 



“K.G. Vijayakrishnan (personal communication) reports that Tamil, a closely 
related Dravidian language, does not allow ER to apply inside of inflectional
elements.

(10) a. nannu baagil-annu much-id-e giigilannu muchide
        I-NOM door-ACC close-PST-1S RED
        anta heeLa-beeDa
        that say-PROH
        ‘Don’t say that I closed the door or did related activities.’

     b. pustav-annu meejin-a meele giijina meele nooD-id-e
        book-ACC table-GEN on RED see-PST-1S
        ‘I saw the book on the table and in related places.’

The data in (8-10) are problematic for the strictest variant of the lexi- 
calist hypothesis, namely one in which all morphological composition takes 
place in the lexicon. To my knowledge, no-one has ever explicitly held such 
a position (but see Chomsky 1993, which may hold it implicitly). The reason 
such data are problematic for the staunch lexicalist is that the rule applies 
equally to subword and phrasal constituents, an impossibility if the internal 
morphological structure is erased upon insertion into the syntactic phrase- 
marker. 

2.1 Variants of Weak Lexicalism 

2.1.1 Derivation = Lexical. Inflection = Syntactic 

One step back from the staunch lexicalist is the weak-lexicalist, who would 
hold that derivation and inflection are distinguished with respect to the lexi- 
con. On this view, derivational morphology applies inside the lexicon while 
inflectional morphology applies outside the lexicon (Anderson 1984, 1992). 
The weak lexicalist would expect a syntactic rule of ER to be able to capture 
the facts given in (8-10), but would predict that ER would not be able to 
reach into complex words formed by rules of derivational morphology. 

In (11-13) we see that ER can apply either inside or outside of valency 
changing morphology, prototypically considered to be derivational/lexical 
(Grimshaw 1982, Lieber 1980, Selkirk 1982, DiSciullo and Williams 1987):^ 



’See Lidz (1998) for arguments that the reflexive and causative morphology of 
Kannada is not added to a root inside the lexicon. 






(11) Anticausative use of reflexive 

a. muchu 
close 

‘to close (tr.)’ 

b. muchi-koLLu 
close-REFL 

‘to close (intr.)’ 

c. baagilu muchi-gichi-koND-itu anta heeLa-beeDa 
door-NOM close-RED-REFL.PST-3SN that say-PROH 
‘Don’t say that the door closed or did related things.’ 

d. baagilu muchi-koND-itu-gichikoNDitu anta heeLa-beeDa 

door-NOM close-REFL.PST-3SN-RED that say-PROH 

‘Don’t say that the door closed or did related things.’ 

(12) Reflexive use of reflexive 

a. hogaLu 
praise 

‘to praise’ 

b. hogaLi-koLLu 
praise-REFL

‘to praise oneself.’ 

c. rashmi tann-annu hogaLi-gigaLi-koND-aLu anta heeLa-beeDa 

Rashmi self-ACC praise-RED-REFL.PST-3SF that say-PROH 

‘Don’t say that Rashmi praised herself and did related activities.’ 

d. rashmi tannannu hogaLi-koND-aLu-gigaLikoNDaLu
Rashmi self-ACC praise-REFL.PST-3SF-RED
anta heeLa-beeDa
that say-PROH
‘Don’t say that Rashmi praised herself and did related activities.’

(13) Causative

a. kaTTu
build
‘to build’

b. kaTT-isu
build-CAUS
‘to make build’

c. naanu mane-yannu kaTT-giTT-is-id-e anta heeLa-beeDa
I-NOM house-ACC build-RED-CAUS-PST-1S that say-PROH
‘Don’t say that I had a house built and did related activities.’

d. naanu mane-yannu kaTT-isi-giTTis-id-e anta heeLa-beeDa
I-NOM house-ACC build-CAUS-RED-PST-1S that say-PROH
‘Don’t say that I had a house built and did related activities.’

e. naanu mane-yannu kaTT-is-id-e-giTTiside anta heeLa-beeDa
I-NOM house-ACC build-CAUS-PST-1S-RED that say-PROH
‘Don’t say that I had a house built and did related activities.’

Similarly, ER can occur inside or outside of category changing morphol- 
ogy, such as the verbalizing use of the causative morpheme or the deadjecti- 
valizing pronominal affixes. 

(14) Verbalizing use of causative 

a. patra 
letter 
‘letter’ 

b. patr-isu
letter-CAUS
‘to write a letter’

c. Rashmi Vijay-ige patra-gitr-is-id-aLu anta heeLa-beeDa
Rashmi Vijay-DAT letter-RED-CAUS-PST-3SF that say-PROH
‘Don’t say that Rashmi wrote Vijay a letter and did related activities.’

d. Rashmi Vijay-ige patr-is-gitris-id-aLu anta heeLa-beeDa
Rashmi Vijay-DAT letter-CAUS-RED-PST-3SF that say-PROH
‘Don’t say that Rashmi wrote Vijay a letter and did related activities.’

(15) Deadjectival nouns

a. cikka
small
‘small’

b. cikk-avanu
small-he
‘one who is small.’

c. avanu cikk-gikk-avanu alla
he-NOM small-RED-he NEG
‘It’s not as if he’s a young etc. man.’

d. avan-annu cikk-avanu-gikkavanu anta heeLa-beeDa
he-ACC small-he-RED that say-PROH
‘Don’t say that he’s a young man and such.’

These data are problematic for the weak-lexicalist because in them, ER 
treats the substructures of words with derivational morphology as equivalent 
to the substructures of words with inflectional morphology and entire syntac- 
tic phrases. Hence, a view in which derivation is lexical but inflection is 
syntactic will not divide the world in a way consistent with the demands of 
ER. 

It is important to note at this point that there are some domains in which 
ER may not apply. Consider the examples in (16-20), in which ER cannot 
apply inside of certain affixes. 

(16) a. toor-ike
        show-NMNL
        ‘appearance’

     b. * toor-giir-ike
          show-RED-NMNL

     c. toor-ike giirike
        show-NMNL RED
        ‘appearances and related things’

(17) a. tooru-vike
        show-GER
        ‘showing’

     b. * toor-giiru-vike
          show-RED-GER

     c. tooruvike giiruvike
        show-GER RED
        ‘showing and related activities’

(18) a. ooD-aaTa
        run-play
        ‘running around’

     b. * ooD-giiD-aaTa
          run-RED-play

     c. ooD-aaTa giiDaaTa
        run-play RED
        ‘running around and related activities’

(19) a. hoogu-vudu
        go-GER
        ‘going’

     b. * hoog-giig-uvudu
          go-RED-GER

     c. hoogu-vudu giiguvudu
        go-GER RED
        ‘going and related activities’

(20) a. doDDa-tana
        large-NOM
        ‘largeness’

     b. * doDD-giDDa-tana
          large-RED-NOM

     c. doDDatana giDDatana
        large-NOM RED

The fact that ER cannot apply inside of certain derivational affixes sug- 
gests that weak lexicalism may be right in saying that some morphological 
operations are syntactically represented while others are not, but wrong in 
making the division correspond to the division between derivation and in- 
flection (perhaps suggesting that such a distinction is not real). We return to 
this question below. 

2.1.2 Idiosyncratic = Lexical. Compositional = Syntactic 

An alternative variant of weak lexicalism might say that the distinction be- 
tween lexicon and syntax is not reflected in the difference between derivation 
and inflection, but rather in the difference between the idiosyncratic and the 
compositional. On this view, we might expect ER to be able to reach only 
inside of semantically compositional structures, but not inside of noncompo- 
sitional structures. This hypothesis is immediately called into question by the 
fact that ER can apply to the internal elements of idiomatic expressions, as 
demonstrated in (21) and (22). 

(21) a. Hari kannu much-id-a 

Hari eye close-PST-3SM 

‘Hari died.’ (lit. Hari closed his eyes) 

b. Hari kannu-ginnu much-id-a

c. Hari kannu muchida ginnu muchida 

(22) a. Rashmi Hari-ge maNNu tinn-is-id-aLu 

Rashmi Hari-DAT mud eat-CAUS-PST-3SF 
‘Rashmi ruined Hari.’ (lit. Rashmi made Hari eat mud) 

b. Rashmi Hari-ge maNNu giNNu tinn-is-id-aLu 






c. Rashmi Hari-ge maNNu tinn-is-id-aLu giNNu tinnisidaLu

The existence of phrasal idioms like (21a) and (22a) is by itself potentially
problematic for the lexicalist hypothesis, because such idioms show that the
domain of semantic idiosyncrasy does not correspond to the morphophonological
word. While this problem does not seem to alarm lexicalists (cf.
Jackendoff 1997), the fact that ER treats the subparts of syntactic idioms on a
par with the subparts of syntactic phrases may. The fact that ER treats the
subparts of semantically non-decomposable chunks on a par with the subparts
of semantically decomposable chunks suggests that a grammar which separates
the lexicon from the syntax on the basis of semantic idiosyncrasy embodies
the wrong architecture.

The problems for a variant of lexicalism that takes idiosyncrasy to be the
hallmark of the lexicon can also be seen by examining the distinction between
"word-level" and "stem-level" affixation. Aronoff and Sridhar (1983)
show that the distinction between word-level and stem-level affixation in 
Kannada is diagnosed by a correspondence between epenthetic [u] (Bright 
1972) and semantic transparency. They demonstrate the correlation by ex- 
amining the properties of the nominalizing suffix -ike. When attached at the 
stem-level, there is no epenthetic [u] and the meaning of the derived form is 
idiosyncratically related to the base. On the other hand, when this affix is 
attached at the word-level, there is an epenthetic [u] and the derived form is 
transparently a gerund. Moreover, there are some verbs for which there is no 
stem-level variant, whereas all verbs have a word-level, gerundive variant. 



(23)    verb     gloss     +ike         gloss            #ike         gloss
    a.  beeDu    ‘beg’     beeDike      ‘plea’           beeDuvike    ‘begging’
    b.  jaaru    ‘slide’   jaarike      ‘slipperiness’   jaaruvike    ‘sliding’
    c.  keeLu    ‘ask’     kaaLike      ‘request’        kaaLuvike    ‘asking’
    d.  tooru    ‘show’    toorike      ‘appearance’     tooruvike    ‘showing’
    e.  horaDu   ‘leave’   *hooraDike                    horaduvike   ‘leaving’



Now, if we take a variant of the lexicalist hypothesis to hold that productive 
morphological rules with transparent meaning are syntactic while nonpro- 
ductive morphological rules with idiosyncratic meaning are lexical, then we 
would expect to find ER able to apply inside of gerundive -ike but not inside
of the stem-level variant of this affix.

The data come out otherwise. ER is not possible inside of either variant
of -ike, a problem to which we will return.

(24) a. toor-ike 

show-NMNL 

‘appearance’ 

b. * toor-giir-ike 

c. toorike giirike 

(25) a. tooru-vike 

show-GER 

‘showing’ 

b. * tooru-giiru-vike 

c. tooruvike giiruvike 

Even worse for this variant of lexicalism is that there are both stem-level 
and word-level affixes that ER can apply inside of, such as the causative -isu 
and the plural -gaLu, respectively: 

(26) a. beeD-isu 

beg-CAUS 
‘to cause to beg’ 

b. * beeDu-visu 

c. beeD-giiD-isu 
beg-RED-CAUS 

‘to cause to beg and related activities’ 

d. beeD-isu-giiDisu 
beg-CAUS-RED 

‘to cause to beg and do related activities’ 

(27) a. kaalu-gaLu 

leg-PL 

‘legs’ 



b. * kaaligaLu

c. kaalu-giilu-gaLu
leg-RED-PL
‘legs and stuff’

d. kaalu-gaLu-giilugaLu
leg-PL-RED
‘legs and stuff’

We can conclude that neither the distinction between stem-level and 
word-level affixation, nor the related distinction between semantically idio- 
syncratic and semantically transparent affixation gives us a way to determine 
which affixes ER can apply inside of and which it cannot. 

3 Some Less Plausible Lexicalist Solutions 

3.1 Two Rules 

One possibility for maintaining lexicalism given that ER applies equally to 
subparts of words and entire phrases would be to posit two rules of ER. On 
this view, there are two separate but identical rules of reduplication, one ap- 
plying in the lexicon (to sublexical material) and a second applying in the 
syntax (to lexical and phrasal material). 

The problem with the two rules gambit is that it is redundant. Giving up 
the Lexicalist Hypothesis in favor of a theory in which morphologically 
complex words are syntactically complex allows us to explain ER with one 
rule which applies to any syntactic constituent. 

3.2 ER is Phonological 

A second possibility for maintaining the Lexicalist Hypothesis would be to 
say that ER is phonological. A phonological analysis of ER, in which the 
elements which can undergo reduplication are all of the same phonological 
category, would circumvent the lexicalist objection by showing that the rule 
has no morphosyntactic relevance. 

This tack is problematic for three reasons. First, there is no single
phonological constituent represented by the elements which can undergo ER.
That is to say, given a single input like (28a), the rule produces three outputs:

(28) a. kaTT-is-id-e
        build-CAUS-PST-1S

     b. kaTT-giTT-is-id-e
        build-RED-CAUS-PST-1S

     c. kaTT-isi-giTTis-id-e
        build-CAUS-RED-PST-1S

     d. kaTT-is-id-e-giTTiside
        build-CAUS-PST-1S-RED

ER can apparently decide to break the word at any of its morpheme
boundaries, irrespective of phonological constituency. This point is especially
clear when we examine a word whose morphological structure differs from
its phonological structure. Consider (29), with the morphological structure in
(29b) and the syllabification in (29c):

(29) a. hogaLikoNDaLu
        ‘she praised herself.’

     b. [[[hogaLi]-koND] -aLu]
           praise -REFL.PST-3SF

     c. ho.ga.Li.koN.Da.Lu

The three possible outputs of ER given (29a) are those in (30).

(30) a. hogaLi-gigaLi-koND-aLu

     b. hogaLi-koND-gigaLikoND-aLu

     c. hogaLi-koND-aLu-gigaLikoNDaLu

These correspond to the morphological constituents of (29). Impossible ERs
of (29a) are given in (31).



(31) a. * ho-gi-gaLikoNDaLu

     b. * hoga-giga-LikoNDaLu

     c. hogaLi-gigaLi-koNDaLu (=(30a))

     d. * hogaLikoN-gigaLikoN-DaLu

     e. * hogaLikoNDa-gigaLikoNDa-Lu

The reduplications in (31) are the outputs of an ER rule applied to (groups of) 
syllables. For example, (31a) reduplicates just the first syllable, (31b) redu- 
plicates the first two syllables, etc. None of these is a possible reduplication 
(with the exception of (31c) which corresponds to a morphological break as 
well as a phonological one), despite the fact that any of them could poten- 
tially occur if syllables (or larger prosodic units made up of syllables) were 
the units over which the rule applied. 
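
The contrast between (30) and (31) can be made concrete with a small sketch. It is an illustration only, under two assumptions that go beyond the text: the echo form replaces the initial consonant(s) and first vowel of the copied string with g plus i (ii for a long vowel), and the candidate bases are exactly the nested constituents of the left-branching structure in (29b), represented here as a flat list of morphemes. The names `echo` and `er_outputs` are hypothetical.

```python
import re

# Assumption: lowercase a, e, i, o, u are vowels; capitals are consonants.
_FIRST_SYLLABLE = re.compile(r"^([^aeiou]*)([aeiou]+)(.*)$")


def echo(base: str) -> str:
    """Replace the initial onset and first vowel of the base with g + i(i)."""
    match = _FIRST_SYLLABLE.match(base)
    if match is None:
        raise ValueError(f"no vowel found in {base!r}")
    _onset, nucleus, rest = match.groups()
    return "g" + "i" * len(nucleus) + rest


def er_outputs(morphemes: list[str]) -> list[str]:
    """Reduplicate each nested constituent of a left-branching word.

    For [[[m1]-m2]-m3] the constituents are m1, m1+m2 and m1+m2+m3, so the
    echo is inserted after the first k morphemes for every k.
    """
    outputs = []
    for k in range(1, len(morphemes) + 1):
        copied, rest = morphemes[:k], morphemes[k:]
        outputs.append("-".join(copied + [echo("".join(copied))] + rest))
    return outputs


if __name__ == "__main__":
    # Morphemes of (29b): [[[hogaLi]-koND]-aLu]
    for form in er_outputs(["hogaLi", "koND", "aLu"]):
        print(form)
```

Run on the morphemes of (29b), the sketch produces the three forms in (30) and none of the syllable-based forms starred in (31). It deliberately takes morphological bracketing, not syllables, as its input.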

A bigger problem for the phonological analysis is that the rule respects 
morphological and syntactic constituency. In the ungrammatical (32), just the 
nonroot elements of the verb are reduplicated. These morphemes do not form 
a morphosyntactic constituent and so this reduplication is barred. 

(32) * hogaLi-koND-aLu-giNDaLu (cf. (29b)) 

In (33c), a hypothesized phrasal reduplication of (33a) (whose structure 
is (33b)), we see that it is ungrammatical to reduplicate the subject and object 
to the exclusion of the verb, despite the fact that these elements are adjacent 
in the string. Only syntactic constituents can be reduplicated. 

(33) a. Rashmi avan-annu hogaL-id-aLu 

Rashmi he-ACC praise-PST-3SF 

‘Rashmi praised him.’ 

b. [Rashmi [TP [VP avan-annu hogaL- ] id- ] aLu ]

c. * Rashmi avan-annu gishmi-avanannu hogaL-id-aLu 

Rashmi he-ACC RED praise-PST-3SF 

Intended: ‘Rashmi and related people praised him and related 
people.’ 

An additional problem with the phonological analysis of ER is that ER is
syntactically and semantically restricted when it involves a predicate (V or
VP). A predicate may undergo ER only if it is embedded under a modal element,
such as the prohibitive (negative imperative) (=(34)), negation (=(35a,b)),
the question morpheme (=(35b,c)), etc.:

(34) a. * baagil-annu much-gich-id-e
          door-ACC close-RED-PST-1S
          ‘I closed the door and did related activities.’

     a'. baagil-annu much-gich-id-e anta heeLa-beeDa
         door-ACC close-RED-PST-1S that say-PROH
         ‘Don’t say that I closed the door and did related activities.’

     b. * baagil-annu much-id-e gichide
          door-ACC close-PST-1S RED
          ‘I closed the door and did related activities.’

     b'. baagil-annu much-id-e gichide anta heeLa-beeDa
         door-ACC close-PST-1S-RED that say-PROH
         ‘Don’t say that I closed the door and did related activities.’

     c. * naanu baagil-annu muchide giigilannu muchide
          I-NOM door-ACC close-PST-1S RED
          ‘I closed the door and did related activities.’

     c'. naanu baagil-annu muchide giigilannu muchide
         I-NOM door-ACC close-PST-1S RED
         anta heeLa-beeDa
         that say-PROH
         ‘Don’t say that I closed the door and did related activities.’

     d. baagil-annu-giigilannu muchide
        door-ACC-RED close-PST-1S
        ‘I closed the door and related things.’

(35) a. hari baagilannu muchi-gich-al-illa 

Hari door-ACC close-RED-INF-NEG 

‘Hari didn’t close the door or do any such thing.’ 

     b. niinu baagil-annu muchi-gich-al-illa-valla-a
        you door-ACC close-RED-INF-NEG-TAG-Q
        ‘You didn’t close the door or do any such thing, did you?’

     c. hari baagil-annu muchi-gich-id-a-a
        Hari door-ACC close-RED-PST-3SM-Q
        ‘Did Hari close the door or do any such thing?’

Given that the same phonological material can be reduplicated success- 
fully in some syntactic/semantic contexts but not in other syntactic/semantic 
contexts, a strictly phonological analysis is untenable. 

4 Level Ordering, ER and the Lexicalist Hypothesis 

The distinction between word-level and stem-level affixation gives us an 
additional argument for morphological structure being syntactically visible. 
The argument grows out of A&S’s observation that word-level affixation can 
apply inside of stem-level affixation in Kannada.® A&S’s discussion is based 
on two suffixes: the dative -ge and the plural -gaLu. 

First, all forms to which -gaLu attaches can occur as free forms whereas 
the same is not true of forms to which -ge attaches. 







(36)                singular   plural       dative
     a. ‘house’     mane       manegaLu     manege
     b. ‘rock’      banDe      banDegaLu    banDege
     c. ‘leg’       kaalu      kaalugaLu    kaalIge    *kaalI
     d. ‘forest’    kaaDu      kaaDugaLu    kaaDIge    *kaaDI



In (36c-d), both the [u] in the singular and plural forms and the [I] in the 
dative are epenthetic. The [u] is added word finally to all consonant final 
stems, as can be seen clearly in borrowings of consonant final words: 



(37) a. ‘spoon’   spuunu
     b. ‘car’     kaaru
     c. ‘pen’     pennu
     d. ‘bus’     bassu



From this A&S conclude that -gaLu is a word-level affix because the
same epenthetic vowel occurs on stems to which it attaches as on whole
words. The [u] of -gaLu is this same epenthetic vowel. This can be seen
when we add casemarkers to a plural word. In such an environment the
epenthetic [u] does not occur. Moreover, when we add a consonant initial
casemarker it is the epenthetic [I] which occurs.

(38) a. ‘car-PL-ACC’ kaaru-gaL-annu

     b. ‘car-PL-DAT’ kaaru-gaLI-ge

®See Aronoff (1976) for the same observation in English.
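
The epenthesis facts in (36)-(38) can be summarised as a small sketch. It is an illustration under assumptions, not A&S’s formalisation: word-level attachment is modelled as treating the stem like a free word (so a consonant-final stem first receives [u]), and stem-level attachment as inserting [I] only between a consonant-final stem and a consonant-initial suffix. The underlying consonant-final stems `kaar`, `kaal` and the bare plural `gaL` are assumed here, and the function names are hypothetical.

```python
# Assumption: a, e, i, o, u and epenthetic I are vowels; everything else
# (including the capitals D, L, N, T) is a consonant.
VOWELS = set("aeiouI")


def ends_in_consonant(form: str) -> bool:
    return form[-1] not in VOWELS


def starts_with_consonant(form: str) -> bool:
    return form[0] not in VOWELS


def word_level(stem: str, suffix: str = "") -> str:
    """Attach a word-level suffix: the stem behaves like a free word,
    so a consonant-final stem gets epenthetic [u] before the suffix."""
    base = stem + "u" if ends_in_consonant(stem) else stem
    return base + suffix


def stem_level(stem: str, suffix: str) -> str:
    """Attach a stem-level suffix: [I] is inserted only between a
    consonant-final stem and a consonant-initial suffix."""
    if ends_in_consonant(stem) and starts_with_consonant(suffix):
        return stem + "I" + suffix
    return stem + suffix


if __name__ == "__main__":
    plural_stem = word_level("kaar", "gaL")       # plural is word-level
    print(word_level(plural_stem))                # free plural form
    print(stem_level(plural_stem, "annu"))        # plural + accusative
    print(stem_level(plural_stem, "ge"))          # plural + dative
```

With these assumptions the sketch yields kaarugaLu for the free plural, kaarugaLannu for (38a) and kaarugaLIge for (38b), which shows a stem-level suffix attaching outside a word-level one.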

Now, the fact that the stem-level dative (and other casemarkers, as evi- 
denced by the epenthesis facts) occurs outside of the word-level plural leads 
A&S to conclude that there is no level-ordering in the sense of Mohanan 
(1981) and Kiparsky (1982). They don’t deny that the levels exist but only 
claim that there is no ordering and no bracket erasure. 

A&S’s conclusion is lexicalist in nature because it assumes that there are
different levels of affixation in the lexicon. There is an alternative analysis, 
of course, which posits that the difference between the stem-level and word- 
level affixes is stated not in terms of levels, but in terms of boundary sym- 
bols, as in Chomsky and Halle (1968). The important finding of A&S is that 
there are two kinds of boundaries and that there are no ordering restrictions 
on these boundaries. They assume that these are types of lexical boundaries, 
though nothing they say forces this conclusion. The crucial result is only that 
the boundaries are visible simultaneously. 

Now, given the observation that ER can apply to syntactic phrases as 
well as to sub-word constituents and the observation that word-level and 
stem-level boundaries must be visible simultaneously, we are led to the con- 
clusion that these levels are syntactically represented. That is, A&S tell us 
that the two types of boundaries are marked at the same level, but are agnos- 
tic as to whether this is in the lexicon or in the syntax. Given that ER can (a) 
reach inside of these boundaries and (b) apply to syntactic phrases, we are led 
to conclude that the two types of boundaries are syntactically, and not lexi- 
cally, represented. 

5 When Echo Does Not Apply 

This section provides a first step towards determining whether there is any 
systematicity in which affixes are syntactically represented. As we have seen, 
using ER as a test leads us to conclude that certain cases of apparent affixa- 
tion are not syntactically complex. To account for these facts, a view in 
which all morphology is postsyntactic, such as Distributed Morphology, will
require that some morphological structure is represented phrase-structurally
and other morphological structure is due to nonstructural aspects of the
syntax.

Consider, as an illustration, Marantz’s (1997) reinterpretation of Chom- 
sky’s (1970) arguments about nominalization. Marantz’s hypothesis takes it 
that the relation between a verb and its nominalization is based on syntactic 
category only. There is a single root whose pronunciation depends upon its 
syntactic category. In other words, a nominalization is simply what you get 
when you put a root of a certain type in the nominal environment; if you were 
to put this root in a verbal environment, you would have gotten a verb. There 
is no transformation from one to the other. For example, the root √destr- in
the verb context will be pronounced destroy and in the noun context will be
pronounced destruction. On this view, it is not the case that -tion is an affix
heading its own piece of phrase structure (or morphological structure).
Rather, the environment of the root determines whether it will be pronounced
with the -tion affix. The simple fact of being dominated by an N node
determines whether this affix is present. Here, the syntax determines the
pronunciation, but by feature, not by configuration. In other words, under the
Marantz-Chomsky hypothesis, the root √destr- has the following morphological
properties:

(39) a. √destr- <-> [N destruction]

     b. √destr- <-> [V destroy]

Hence, the factor determining how the root is realized is the syntactic 
category of the word, not its syntactic structure. In fact, it has no syntactic 
structure. The ‘affixes’ which appear on the root arise because of the syntac- 
tic environment but are not explicitly represented as nodes in a nested tree- 
structure. 

Other affixes, of course, quite clearly are syntactic heads and the facts of 
ER give us a way to determine which ones these are (in Kannada). ER can 
tell us which affixes are present because they correspond to independent 
heads in the phrase structure and which are present because of categorical 
properties of the context. In other words, given the conclusion that morphol- 
ogy applies postsyntactically and the fact that some affixes appear to be 
phrase-structurally represented while others do not, we are led to the conclu- 
sion that some apparent affixes occur because of aspects of the syntactic en- 
vironment which are not part of the nested tree-structures we take to be the 
core of syntactic combination. 

The two kinds of “affixation” are illustrated in (40).

(40) a. patr-isu
        letter-CAUS
        ‘to write a letter’

     b. toor-ike
        show-NMNL
        ‘appearance’

Because ER can reach inside of a morphologically complex word like
(40a), we take the boundary between the morphemes to be syntactically
represented. The root and the affix each head their own pieces of phrase
structure, as in (41):

(41)    N       V
        |       |
      patra   -isu

ER cannot apply inside of the morphologically complex (40b), as we have
seen, and so its syntactic structure is nonbranching:

(42)    N
        |
      toor-

This root is listed in the morphological component as having two alternative
pronunciations depending on its syntactic category, as in (43):

(43) a. √toor- <-> [N toorike]
     b. √toor- <-> [V tooru]

The appearance of the “morpheme” [-ike] is determined by the morphologi- 
cal component and does not correspond to a piece of syntactic structure. 
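
The realization statements in (39) and (43) amount to a lookup from a root and a syntactic category to a pronunciation. The following sketch shows that reading only. The table `VOCABULARY` and the function `pronounce` are hypothetical names, and the dictionary is simply one assumed way of encoding realization "by feature, not by configuration".

```python
# Each entry pairs a root and the category of the node dominating it with a
# pronunciation. No affix is represented as a separate piece of structure.
VOCABULARY = {
    ("destr", "N"): "destruction",
    ("destr", "V"): "destroy",
    ("toor", "N"): "toorike",
    ("toor", "V"): "tooru",
}


def pronounce(root: str, category: str) -> str:
    """Return the pronunciation of a root in a given categorial environment."""
    try:
        return VOCABULARY[(root, category)]
    except KeyError:
        raise ValueError(f"no listed form for root {root!r} as {category}") from None


if __name__ == "__main__":
    print(pronounce("toor", "N"))
    print(pronounce("toor", "V"))
```

Because toorike is looked up as a whole, the sketch contains no internal boundary in the word for a rule like ER to target, in contrast to a form such as patr-isu whose parts are separate heads.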

We can conclude that a theory of morphology which takes all cases of 
morphological complexity to correspond to syntactic complexity is too strong 
to account for the data. On the other hand, a theory which recognizes both an 
independent morphological module and a syntactic module of phrase- 
structure composition can make the appropriate discrimination to account for 
the observed pattern of facts in Kannada. Whether there is any systematicity
to the set of affixes which do not correspond to pieces of syntactic structure
and whether there is any relationship between these affixes and any other
phonological, syntactic or semantic properties remains to be investigated.

6 Conclusions 

ER is a postsyntactic rule which, on the whole, does not distinguish between
word-internal and word-external structure, suggesting that such a distinction
is unnecessary. On this view, morphological complexity generally corresponds
to syntactic complexity. We have noted, however, that certain cases
of apparent affixation are not syntactically complex. A view in which all 
morphology is postsyntactic, such as Distributed Morphology, will require 
that some morphological structure is represented phrase-structurally and 
other morphological structure is due to nonstructural aspects of the syntax. 
This theory is superior to a lexical theory which treats the word formation 
component as wholly distinct from the syntactic component. It is also supe- 
rior to a theory which eliminates a morphological component altogether by 
subsuming the functions of morphology into the syntax.



The Distribution of the Old Irish Infixed Pronouns:
Evidence for the Syntactic Evolution
of Insular Celtic?*

Ronald Kim 



1 Infixed Pronouns in Old Irish 



One of the most peculiar features of the highly intricate Old Irish pronominal 
system is the existence of three separate classes of infixed pronouns used 
with compound verbs. These sets, denoted as A, B, and C, are not inter- 
changeable: each is found with particular preverbs or, in the case of set C, 
under specific syntactic conditions. Below are listed the forms of these
pronouns, adapted from Strachan (1949:26) and Thurneysen (1946:259-60),
excluding rare variants:

          A               B                             C
sg. 1     -m(m)’          -dom’, -dum’, -dam(m)’,       -dom’, -dam’
                          -tom’, -tum’, -tam(m)’
    2     -t’             -tot’, -tut’, -tat’, -t’      -dot’, -dat’, -dit’
    3 m.  -a n-, -∅ n-    -t n-                         -(d)id n-, -d n-, -∅ n-
      f.  -s (n-)         -da h-, -ta h-                -da h-
      n.  -a’, -∅’        -t’                           -(d)id’, -d’, -∅’
pl. 1     -n(n)           -don, -ton, -tan(n)           -don, -dun, -dan(n)
    2     -b              -dob, -dub, -tob, -tab        -dob, -dub, -dab
    3     -s (n-)         -da h-, -ta h-                -da h-



Leaving aside for the moment the last set, which is limited to relative
clauses introduced by a preposition (plus relative (s)a n-, with the sole
exception of i n- ‘in, in which’) and after certain conjunct particles such as dia
n-, ma’ ‘if, when’, cia’ ‘though, unless’, ara n- ‘in order that’, co n- ‘so that’,
and interrogative in n- (Pedersen 1913:145-7, Thurneysen 1946:258), it is
generally agreed that the distribution of the first two classes is determined by
the (prehistoric) phonetic shape of the individual preverbs. Those which
ended in a vowel in Primitive Irish take class A, e.g., Wb. 5c6 ním charat-sa
‘they don’t love me’, 30d20 imma n-imcab ‘avoid him’, 15a7 na chomalnid-si
‘fulfill it’, 23d4 rob car-si ‘he has loved you (pl.)’, 19d24 dos m-berthe ‘ye
would have given them’. Preverbs which ended in a consonant, on the other
hand, are found with class B pronouns: cf. Ml. 39c27 fritamm orcat ‘they
offend me’, Wb. 6c16 attot-aig ‘which impels you’, Ml. 112a3 cot n-erba ‘he
will trust himself’, Wb. 31c16 fordon cain ‘teaches us’, 5a13 ata samlibid-si
‘you (pl.) will imitate them’. The preverbs associated with each class and
their reconstructed Primitive Irish, Proto-Celtic, and Proto-Indo-European
shapes are the following:

Class A

ar- < PrimIr. *ari < PC *ɸari < PIE *prH-i
di-, do- < PrimIr. *dī < PC, PIE *de
do- < PrimIr. *tu < *tu < PC *to < PIE *to (Schrijver 1995:17fn.2) or <
    PrimIr., PC, PIE *to (OHitt. ta)^
fo- < PrimIr. *wo < PC *uɸo < PIE *upo
im(m)- < PrimIr. *imbi < PC *ambi < PIE *h₂nt-bʰi (Jasanoff 1976; see
    Schrijver 1991 for raising of *a before nasal + voiced stop in pre-OIr.)
neg. ní-, ni- < PrimIr. *ne < PC, PIE *ne
no- < PrimIr. *no, *nu (?) < PC, PIE *nu
ro- < PrimIr. *ro < PC *ɸro < PIE *pro

Class B

ad-, -ad-/-aC-/-d- < PC, PIE *ad
ad-, -aith-/-aid- < *ati^
con-, -com- < PrimIr., PC *kom < PIE *kom
as-, -ess-/-eC-/-e- < PrimIr., PC *exs < PIE *eK(s)
eter-, -et(a)r- < PrimIr. *edder < PC *anter < (post-)PIE *n-ter (Lat. inter)
for-, -for- < PrimIr. *wor < *wer (probably on analogy of *wo ‘under’)
    < PC *uɸer^ < PIE *uper
fri-, -frith-/-freC- < *writi
in-, -in(d)- < PrimIr. *in < PC *en (?) < PIE *en
as-, -oss- < PrimIr. *uxs, *uss < PC *uxs, *uts < PIE *up(s) or *ud(s)

^See Schrijver (1995:17fn.2) for arguments in favor of a preform *tu. Note,
however, that only *to is attested in Continental Celtic (J. Eska, p.c.), e.g., in Gaul.
to=me=declai natina ‘(their) dear daughter set me up’ (Voltino; see fn. 23) and as a
sentence connective in Celtib. ENIOROSEI VTA TIGINO TIATUNEI ERECAIAS TO
LUGUEI ARAIANOM COMEIMV ‘To Eniorosis and to Tiatu of Tiginos the furrows,
(and) to Lugus the farmland we dedicate’ (Peñalba de Villastar; cf. Ködderitzsch
1985:216, Eska 1990a:106-7).

^The class B infixed pronouns used with ad-, -aith-/-aid- < *ati are analogical to
ad- < *ad.
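
The generalization behind the two lists can be stated as a simple check on the final segment of the Primitive Irish preform. The sketch below is an illustration only: the function name `infix_class` is hypothetical, the test forms are limited to preforms explicitly labelled Primitive Irish in the lists above, and the analogical class B pronouns after ad- < *ati mentioned in the footnote are deliberately not modelled.

```python
# Assumption: the reconstructions are written with plain a, e, i, o, u for
# vowels; every other final symbol is treated as a consonant.
VOWELS = set("aeiou")


def infix_class(primitive_irish_preverb: str) -> str:
    """Return 'A' for a vowel-final preform and 'B' for a consonant-final one."""
    form = primitive_irish_preverb.lstrip("*")
    if not form:
        raise ValueError("empty preform")
    return "A" if form[-1] in VOWELS else "B"


if __name__ == "__main__":
    for preform in ["*ari", "*wo", "*imbi", "*ne", "*ro",
                    "*kom", "*exs", "*edder", "*wor", "*in", "*uxs"]:
        print(preform, infix_class(preform))
```

The check describes the distribution but does not explain it, which is the gap the following discussion addresses.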

It is highly surprising, then, that no explanation has yet been proposed for this
clear phonological distribution. The standard handbooks call no special
attention to these separate sets of infixed pronouns,^ and until recently
(Schrijver 1997:131-9) there have been, to my knowledge, no efforts to provide a
common origin and/or historical account of their coexistence.

Below I will first consider this problem from a purely phonological ap- 
proach (section 2), from which it follows that the combinations of the preverb 
+ infixed pronoun must originally have contained an intervening particle of 
the form *-(V)stV-. This reconstruction is strongly reminiscent of Cowgill’s 
suggestion of *esti as underlying the enclitic particle *-(e)s which he posited 
to explain the contrast between the Old Irish absolute and conjunct inflec- 
tions; the phonological problems raised by such a preform will be examined 
in section 3. In section 4, Old British relics of the absolute/conjunct verbal 
contrast are adduced as support for Old Irish clause-second *esti. Finally, I 
will outline the considerable implications of this hypothesis for the prehistory 
of the VSO syntax of Insular Celtic, and more generally for the evolution of 
Celtic constituent configuration (section 5). In particular, I will propose that 
all main clauses in declarative sentences were topicalized at the Proto- 
Insular-Celtic (PIC) stage, with *esti in C(omplementizer) position and a 
preverb or simple verb obligatorily fronted to Spec-CP. 

2 Phonological Reconstruction 

As already noted, earlier scholars, beginning with Thurneysen, described the 
occurrence of the class A and B infixed pronouns with their respective
preverbs and noted the (exact) correlation between choice of class and final
segment of the Primitive Irish preform.

^Probably on the analogy of *wo ‘under’; cf. Gaulish Ver-cingeto-rīx
‘super-hero-king’, Celtiberian ver-amos ‘leader’ (Schrijver 1995:120).

^Cf. Pedersen (1913:147), Lewis and Pedersen (1937:198), Thurneysen
(1946:257-8), Strachan (1949:26), all of whom merely state the distribution as fact
and list the individual preverbs which take class A or B.

Watkins (1963:26-8) suggests that an originally connective enclitic *de
became fused with preverbs ending in a consonant, leading to the -d- of the
class B pronouns, but retained its “quasi-independent status” after final
vowels, allowing a distinction between e.g., 1sg. *ro-me (> class A rom-) and
*ro-de-me (> class C rodom-, restricted to relative clauses and eventually
becoming generalized there at the expense of class A). This descriptive
account, however, fails to explain why sequences of consonant-final preverb +
infixed pronoun, e.g., 1sg. *kom-me, 3sg. m. *kom-em, were lost and
replaced by constructions with a particle that otherwise occurred only in
relative function. Though the sort of phonologically conditioned occurrence of
particles or morphemes proposed by Watkins for pre-Old Irish *de is not
unknown in the world’s languages,^ one would nonetheless prefer to seek
some other origin for the observed distribution of class A and B endings
without recourse to any ad hoc particles (or rather particles assumed to have
followed an ad hoc pattern) at an earlier stage of the language. Most
importantly, Schrijver (1997:132-4) has emphasized that *de could not have given
the -t-, -d- [-d-] of the class B forms by sound change.

Let us approach the problem from a different, and apparently unrelated, 
area of Old Irish grammar, the verb. In his groundbreaking 1975 article on 
the absolute and conjunct verbal inflection of Insular Celtic and specifically 
Old Irish, Cowgill persuasively argues in favor of a derivation of conjunct 
forms from unsuffixed PIE primary endings, whereas absolute forms arose 
from the addition of a suffix *-(e)s in Wackernagel position after a
clause-initial verb: e.g., 3sg. conj. ·beir < *beret < *bereti, abs. be(i)rid < *bereti+s
(see also Cowgill 1985). Thurneysen (1914:29-30), who rejected Pedersen’s
(1913:340-1) view that the absolute forms resulted from enclitic subject
pronouns, noted that “gemination” after ní ‘not’, i.e., ní h- < *nīs(t) < *nīsti <
*nesti < *ne esti, and other preverbs could be due to a postposed *s (see also
Thurneysen 1946:152-3, 362-3). A particle of this shape explains the vast 
majority of attested endings in the OIr. absolute and conjunct paradigms. The 
lack of an obvious etymology carries little weight against such convincing 
phonological evidence, which itself must provide the basis for any etymo- 



^So, for instance, the modern Korean subject-marking suffix is realized as -i
after a consonant but -ka after a vowel, e.g., che-ka ‘I’ vs. ur-i ‘we’; as -ka is not
attested until the late 16th c., the two do not appear to share a common historical source
(Lee 1977:251, 279). For another example cf. the distribution of Proto-Slavic *-no-
and *-to- in OCS past passive participles: *-to- is found with unsuffixed
sonorant-final and most semivocalic roots (e.g., jętŭ ‘seized’ to *jĭm-, bitŭ ‘beat(en)’ to *bĭj-),
*-no- elsewhere (Schenker 1993:106).



